Patent Application Number: 2007243344

Discussion (14)
United States
http://portal.uspto.gov/external/portal/pair
you can see the references submitted in the US filing for this HP application (look for "information disclosure statement"; there are several under the Image File Wrapper tab), as well as some of the arguments back and forth between the US patent examiner and the applicant's representative (see, for example, the "Non-Final Rejection", in which the examiner initially proposes that claim 1 is anticipated by US patent application 20040054700 (Okada)).
Also, a new rejection dated just yesterday shows that the examiner seems to have stopped referring to Okada and now cites a new prior art reference against claim 1: US Patent 7536291 (Retnamma, et al). This is apparently about a method for *simulating* various storage operations (such as in a sales context; their Fig. 4 pictures a person demoing on a laptop). This patent includes a notion of "cells", and one might check whether this is a more general notion that subsumes the application's "logical bins". The patent discusses many possible kinds of storage operations and may be a bit vague on their details. It includes some references to optimization, e.g.:
"Master storage manager 140 (or other network storage manager) may contain programming directed to analyzing the storage patterns and resources of its associated storage operation cells and which suggests optimal or alternate methods of performing storage operations."
..
"a storage operation cell may contain a data agent 95 which may generally be a software module that is responsible for performing storage operations "
..
"The preferences and storage criteria may include, but are not limited to, a storage location, relationships between system components, network pathway to utilize, retention policies, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, and other criteria relating to a storage operation. "
http://www.faqs.org/patents/app/20090259675
Not prior-art as, judging by the sequence number, it appears to have been filed in 2009.
Seems to be a distinct approach from TSM & NDMP or Data Mirroring. The cell management also seems different from SAN and RAID approaches - mostly due to the differential data compression.
I suspect it'll work a lot better if things aren't compressed before they are stored, as compression increases the entropy of the data (making it less likely that any part of it duplicates other data). Reminds me a lot of the silly idea that was floating around a few years ago that everything could be stored as an offset and a length into an expansion of pi.
I thought the pi expansion offset storage scheme was supposed to be an April Fools' joke, but it is original nonetheless. :)
US Patents
7243186
6904430
US Patent Application Publications
20060184652
20060085561
20050256974
Non-Patent Literature (as of 10/25/2007)
ftp://ftp.research.microsoft.com/pub/tr/TR-2006-157.pdf
ftp://ftp.digital.com/pub/DEC/SRC/publications/broder/positano-final-wpnums.pdf
(see: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.779&rep=rep1&type=pdf)
http://research.microsoft.com/~gurevich/Opera/190.pdf
http://linuxdevices.com/articles/AT6850006074.html
http://www.itnews.com.au/News/62697,seagate-combines-flash-memory-hard-disk-for-notebook-storage.aspx
(see: http://www.informationweek.com/news/hardware/desktop/showArticle.jhtml?articleID=202400049)
The application also describes the data storage system as differential, i.e. duplicated data is not stored again, which eliminates redundant information and improves storage space efficiency.
The application suggests storing data with duplicated content on the same Storage Component or on a single cluster of Storage Components. This is done so that when the data is retrieved, all of the information can be gathered locally.
Clustering all data with duplicated content can introduce bottlenecks in a data storage system, as certain Storage Components become heavily loaded while others are hardly touched. The application then discusses various algorithms and metrics for evening out the load in the system while maintaining high compression efficiency. To do this, it describes a hash-based data similarity test and the concept of query-based data routing, in which each Storage Component compares incoming data with the data it currently stores so that the system can best determine where to store the new data.
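To make the routing idea concrete, here is a minimal sketch of how a hash-based similarity test and query-based routing might fit together. This is my own illustration, not the method claimed in the application: the fixed-size chunking, the SHA-1 chunk hashes, the least-loaded tie-break, and all names (CHUNK_SIZE, StorageComponent, route) are assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical chunk size, kept tiny so the demo is easy to follow


def split_chunks(data: bytes) -> list:
    """Cut data into fixed-size chunks (an assumed, simplistic chunking scheme)."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def chunk_hash(chunk: bytes) -> str:
    return hashlib.sha1(chunk).hexdigest()


class StorageComponent:
    """Hypothetical store that keeps each distinct chunk only once."""

    def __init__(self, name: str):
        self.name = name
        self.known = set()      # hashes of chunks already stored here
        self.stored_bytes = 0   # crude load metric

    def query(self, hashes: set) -> int:
        """Similarity test: how many of the incoming chunk hashes are already here?"""
        return len(hashes & self.known)

    def store(self, data: bytes) -> int:
        """Store only chunks not seen before; return the number of new bytes written."""
        new_bytes = 0
        for chunk in split_chunks(data):
            h = chunk_hash(chunk)
            if h not in self.known:
                self.known.add(h)
                new_bytes += len(chunk)
        self.stored_bytes += new_bytes
        return new_bytes


def route(data: bytes, components: list):
    """Query every component, pick the one with the most overlap,
    and break ties in favour of the least-loaded component."""
    hashes = {chunk_hash(c) for c in split_chunks(data)}
    best = max(components, key=lambda c: (c.query(hashes), -c.stored_bytes))
    return best, best.store(data)


a, b = StorageComponent("A"), StorageComponent("B")
first, new1 = route(b"AAAAAAAABBBBBBBB", [a, b])   # nothing matches anywhere
second, new2 = route(b"CCCCCCCCDDDDDDDD", [a, b])  # no match, goes to the emptier store
third, new3 = route(b"AAAAAAAAEEEEEEEE", [a, b])   # shares one chunk with store A
print(first.name, new1, second.name, new2, third.name, new3)
```

The tie-break on stored bytes is a stand-in for the load-balancing metrics the application talks about; a real system would presumably weigh overlap against load rather than letting overlap always win.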
1. Distributed differential data storage system
2. Data routing algorithms to enhance compression efficiency
Let's start by dissecting part 1.
- A distributed (geographically separated), networked data storage system that has at most 4 layers of entities.
1st Layer: Client computers - Users who save data to and retrieve data from the system.
2nd Layer: Portals - An interface between the Client Computers and the Logical Bins. They receive requests from Client Computers and redirect them to the appropriate entities in the system.
3rd Layer: Logical Bins - Handle data access requests and perform logical operations to determine which Storage Components should be used to access the files. This adds a layer of indirection between a data request and the actual Storage Component, so that new storage can be added easily: the 1st and 2nd layers do not need to be aware of the new Storage Components (requests are still sent to the same bin); we only need to reassign certain bins to the new store.
4th Layer: Storage Components - Computer systems that perform data backup and archival functions.
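The bin indirection in the 3rd layer can be sketched roughly as follows. This is only an illustration under my own assumptions: the application (as summarised above) does not say how a request is mapped to a bin, so the hash-modulo mapping, the bin count, the store names, and the function names here are all hypothetical.

```python
import hashlib

NUM_BINS = 8  # hypothetical fixed number of logical bins


def bin_for(key: str) -> int:
    """Map a data key to a logical bin (assumed here to be a hash modulo NUM_BINS)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BINS


# Bin -> Storage Component table; the only thing that changes when storage is added.
bin_to_store = {b: ("store-A" if b < NUM_BINS // 2 else "store-B")
                for b in range(NUM_BINS)}


def store_for(key: str) -> str:
    """Resolve a key to a Storage Component via its bin."""
    return bin_to_store[bin_for(key)]


def reassign(bins, new_store: str) -> None:
    """Point some bins at a newly added Storage Component."""
    for b in bins:
        bin_to_store[b] = new_store


key = "report.doc"
bin_before = bin_for(key)
store_before = store_for(key)
reassign([bin_before], "store-C")   # bring a new store online for that bin
print(bin_before, store_before, "->", store_for(key))
```

Clients and portals keep computing the same bin for a given key; only the table behind the bins is edited, which is the point of the extra layer. Moving the data already held for a reassigned bin is left out of this sketch.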