[RFC] dedup patterns

Yuval Kashtan
Hello,
The attached patch adds two new patterns, 'Dedup Random' and 'Dedup Progressive'.
The two new modes are needed when evaluating CAS (Content Aware Storage), where we need to produce a certain de-duplication ratio.
The ratio is also controllable through the UI.

Dedup Random -
Data is drawn at random from a pool whose size is (disk size / dedup ratio).
In this case we reach exactly the requested ratio, but only after a few passes over the disk.
Dedup Progressive -
Each freshly generated random block is written to the disk (dedup ratio) times. In this case we start at exactly the specified ratio, but it drops toward the end.
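
A minimal sketch of the two patterns as described above, not the code from the patch. Block contents are modelled as integer IDs, and the function names, the pool-size rounding and the seed are assumptions made for illustration. It only models a stream of writes, so it does not reproduce the effect of overwriting earlier blocks on later passes.

```python
import random


def dedup_random_ids(num_blocks, dedup_ratio, num_writes, rng):
    """Dedup Random (as read here): every write picks its content at random
    from a pool of num_blocks // dedup_ratio distinct block contents."""
    pool_size = max(1, num_blocks // dedup_ratio)
    return [rng.randrange(pool_size) for _ in range(num_writes)]


def dedup_progressive_ids(dedup_ratio, num_writes):
    """Dedup Progressive (as read here): each fresh content is written
    dedup_ratio times in a row before a new one is generated."""
    return [i // dedup_ratio for i in range(num_writes)]


def achieved_ratio(content_ids):
    """Blocks written divided by the number of distinct contents among them."""
    return len(content_ids) / len(set(content_ids))


if __name__ == "__main__":
    rng = random.Random(2012)  # arbitrary seed, for repeatability only
    num_blocks, ratio = 1024, 4
    for written in (64, 256, 1024):
        r = achieved_ratio(dedup_random_ids(num_blocks, ratio, written, rng))
        p = achieved_ratio(dedup_progressive_ids(ratio, written))
        print(f"{written:5d} blocks written: random {r:.2f}, progressive {p:.2f}")
```

In this model the progressive stream sits at the requested ratio from the first full group of writes, while the random stream starts near 1 and climbs as more of the pool has been drawn.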

The patch also includes a previous RFC I sent, which makes DATA_PATTERN_FULL_RANDOM
truly fully random.
For speed and compatibility I've used mt (Mersenne Twister), extracted from the gsl (GNU Scientific Library).
This mode is also relevant for CAS, where the existing FULL_RANDOM mode can actually produce only up to 64 GB of random data (for systems with 4 KB blocks).
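
For a sense of scale, and assuming binary units, 64 GB of 4 KB blocks is 2^24 (about 16.8 million) distinct blocks. The sketch below fills blocks from a Mersenne Twister stream using Python's standard `random` module, which is MT19937-based, as a stand-in for the C implementation in the patch. The helper name, block size constant and seed are illustrative assumptions.

```python
import random

BLOCK_SIZE = 4 * 1024  # 4 KB blocks, as in the message above


def random_block(rng, size=BLOCK_SIZE):
    """Return `size` bytes taken from the generator's output stream."""
    return rng.getrandbits(8 * size).to_bytes(size, "little")


# Number of 4 KB blocks in 64 GB, assuming binary units (hypothetical name).
limit_blocks = (64 * 1024**3) // BLOCK_SIZE

rng = random.Random(12345)  # arbitrary seed; the same seed repeats the stream
blocks = [random_block(rng) for _ in range(8)]
```

Seeding explicitly keeps a test run repeatable, which matters when the same data has to be regenerated for verification.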

Please consider adding these modes to IOMeter.

Sincerely,
Yuval Kashtan

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Iometer-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/iometer-devel

Attachment: dedup.patch (57K)

Re: [RFC] dedup patterns

Daniel Scheibli-2

Hi Yuval,

thank you very much for the patch contribution; it is highly appreciated.

Right now we are wrapping stuff to get version 1.1.0 out of the door,
so this is more a candidate for version 1.2.0 then.

Without having looked into the code yet, the mention of the gsl in your
description gives me some headache given Iometer being licensed under
the IOSL (basically the BSD license).

Best Regards,
Daniel


On 05/15/2012 10:58 AM, Yuval Kashtan wrote:

> for fastness and compatibility I've used mt (Mersenne Twister) extracted from the gsl (GNU Scientific Library)

Re: [RFC] dedup patterns

Yuval Kashtan
I can easily switch to any other implementation. (I'll look for a BSD-licensed one.)

I've used mt successfully in other projects as well, for the sake of good portability (and good randomness).
It seems that the existing random functions are either not thread-safe or inconsistent across systems (not portable).
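
One common way to sidestep both concerns is to give each worker thread its own explicitly seeded generator instead of sharing a global one. The sketch below illustrates that idea with Python's MT-based `random.Random`; it is not taken from the patch, and the function names and seeds are hypothetical.

```python
import random
import threading


def worker(seed, count, out, index):
    """Generate `count` 32-bit values from a generator private to this thread."""
    rng = random.Random(seed)  # no state shared with other threads
    out[index] = [rng.getrandbits(32) for _ in range(count)]


def run(seeds, count=4):
    """Start one thread per seed and collect each thread's values."""
    out = [None] * len(seeds)
    threads = [
        threading.Thread(target=worker, args=(seed, count, out, i))
        for i, seed in enumerate(seeds)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because every thread owns its generator state, the values a given seed produces do not depend on how the threads are scheduled.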

When is 1.1.0 scheduled for?
Will you consider adding the two new modes to that?
Alternatively, when will 1.2.0 be out?


Sincerely,
Yuval Kashtan


On Wed, May 16, 2012 at 11:37 AM, Daniel Scheibli <[hidden email]> wrote:

> Without having looked into the code yet, the mention of the gsl in your
> description gives me some headache given Iometer being licensed under
> the IOSL (basically the BSD license).

Re: [RFC] dedup patterns

Daniel Scheibli-2

> I can easily switch to any other implementation.
> (I'll look for BSD one)

Ok


> When is 1.1.0 scheduled for?

Roughly speaking in the coming weeks.


> Will you consider adding the two new modes to that?

No. Version 1.1.0 scope is closed,
only two more bug fixes are going in.


> alternatively when will 1.2.0 be out?

No plans yet. Current focus is on getting
1.1.0 released. Though a first guess for
official release of 1.2.0 might be end of
this year.



Re: [RFC] dedup patterns

Yuval Kashtan
In reply to this post by Daniel Scheibli-2
Here's a new patch that uses mtwist.c, which is free to use.

Sincerely,
Yuval Kashtan


Attachment: dedup-mtwist.patch (134K)