Patent Application Number: 2007243344
Track the progress of public participation in the review of this pending patent application, and view
application details. The menu on the right will help you navigate this patent application. Subscribing to
the community lets you receive updates on this application via email so that you can easily follow recent activity.
LATEST PRIOR ART
| Date | Title | Reviewer |
|---|---|---|
| 03/30/10 | Redundancy Elimination Within Large Collections of Files | Jimmy Ti |
DISCUSSION
Steven Pearson (over 2 years ago)
See also US Patent Application https://www.delphion.com/details?pn=US27250519A1
Mik Clarke (over 2 years ago)
Microsoft filed an application that explains Differential Data Compression a bit:
http://www.faqs.org/patents/app/20090259675
Not prior art, since, judging by the sequence number, it appears to have been filed in 2009.
Seems to be a distinct approach from TSM & NDMP or Data Mirroring. The cell management also seems different from SAN and RAID approaches - mostly due to the differential data compression.
I suspect it'll work a lot better if things aren't compressed before they are stored, as compression increases the entropy of the data (making it less likely to be duplicated in other data). Reminds me a lot of the silly idea that was floating around a few years ago that everything could be stored as an offset and a length into an expansion of pi.
Jimmy Ti (over 2 years ago)
Now on to part 2.
The application also describes the data storage system as differential, i.e. duplicated data is not stored, which eliminates redundant information and improves storage space efficiency.
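To make the "duplicated data will not be stored" idea concrete, here is a minimal sketch of a content-addressed store. It is not the application's method: the `DedupStore` class, the fixed 4-byte chunk size, and the use of SHA-256 as the chunk identifier are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical): each distinct chunk is kept once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # assumed fixed-size chunking
        self.chunks = {}              # digest -> chunk bytes, stored once
        self.files = {}               # file name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first copy of a chunk is stored; repeats are skipped.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        # Rebuild the file from its chunk references.
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.put("a", b"AAAABBBBCCCC")
store.put("b", b"AAAABBBBDDDD")
print(store.stored_bytes())  # 16 bytes stored for 24 bytes of input
```

The two files share the chunks `AAAA` and `BBBB`, so only four distinct chunks end up on disk. Real systems typically use variable-size chunking, which this sketch leaves out.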
The application suggests storing data with duplicated content on the same Storage Component or on a single cluster of Storage Components, so that when the data is retrieved, all the information can be gathered locally.
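A rough sketch of that placement idea, under my own assumptions: each component is asked how many of the incoming chunks it already holds, and the data goes to the one with the largest overlap. The `StorageComponent` class, the `route` function, and the chunk-hash overlap count are hypothetical names and choices, not taken from the application.

```python
import hashlib


def chunk_hashes(data, chunk_size=4):
    """Set of SHA-256 digests of fixed-size chunks (assumed chunking scheme)."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


class StorageComponent:
    """Hypothetical component that remembers which chunk hashes it holds."""

    def __init__(self, name):
        self.name = name
        self.known = set()

    def overlap(self, hashes):
        # Number of incoming chunks this component already stores.
        return len(self.known & hashes)

    def store(self, hashes):
        self.known |= hashes


def route(components, data):
    """Send data to the component sharing the most chunks with it."""
    hashes = chunk_hashes(data)
    # On a tie, max() returns the first component in the list.
    best = max(components, key=lambda c: c.overlap(hashes))
    best.store(hashes)
    return best.name


components = [StorageComponent("sc0"), StorageComponent("sc1")]
components[0].store(chunk_hashes(b"AAAABBBB"))
components[1].store(chunk_hashes(b"XXXXYYYY"))
target = route(components, b"AAAABBBBCCCC")
print(target)  # sc0, which already holds AAAA and BBBB
```

Routing purely by overlap keeps duplicates together, but it ignores how busy each component is.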
Clustering all data with duplicated content can introduce bottlenecks in a data storage system, as certain Storage Components are heavily loaded while others are hardly touched. The application then describes various algorithms and metrics for evening out the load in the system while maintaining high compression efficiency. To do this, it describes a hash-based data similarity test and the concept of query-based data routing, in which each Storage Component compares incoming data with the data it currently stores so that the system can best determine where to store the data.
PEER TO PATENT ACTIVITY
Discuss Patent Applications
14 comments posted
Size of Community: 5
Upload + Explain Prior Art
1 submitted
Research Prior Art
0 research notes
Annotate and Evaluate Prior Art
0 prior art ratings
1 citation
WHAT IS THIS APPLICATION ABOUT
0 days left
United States