Search: The New Frontier in Data Management
I can’t remember when I stopped organizing my emails and file folders. But I do remember why I stopped. At some point, I realized that, instead of spending my precious time meticulously archiving my documents in sub-directories (and sub-sub-directories and sub-sub-sub-categories), I could rely on Windows’ search function to locate what I needed, when I needed it.
To clarify, I still save documents by loosely defined categories: Work, Family, Friends, Finances, and so on. But I no longer obsess over micro-categories, like interview notes from meeting with mobile app developers at Macworld 2010.
In the past (say, 10 years ago), it was important for me to micro-manage my documents at that level because I had no easy way to locate a piece of information pertaining to a certain project, created on a certain date. I had to impose an artificial structure on my data pile because personal computers weren’t sophisticated enough to discern and handle unstructured data.
Today, my email client (which happens to be Windows Live Mail) can locate a cookie recipe from 2004 buried in my inbox faster than I can, because the software, running on modern CPUs, can peruse many more documents in a second than I can with my human eyes.
Generation Google: Welcome to the Era of Integrated Search Engines!
Seek and You Shall Find, Perhaps More Than You Need
This month, as Oracle users headed home from their annual gathering (Oracle OpenWorld 2011, Oct. 2 to 6), they learned that Oracle was about to buy Endeca, an enterprise search engine. In a letter to customers, Thomas Kurian, Oracle’s executive VP of development, wrote, “The explosion of data variety and volume, including enterprise content and application data, social media, sensor data and third-party feeds, has changed the way that companies and consumers interact and how businesses want to use this information … Endeca’s leading unstructured data management engine, web commerce and business intelligence applications help enterprises improve decisions and deliver a superior customer experience … The combination of Oracle and Endeca is expected to create a comprehensive technology platform to process, store, manage, search, and analyze structured and unstructured information together.”
Oracle’s pending purchase of Endeca echoes another acquisition that took place in June 2010: Dassault Systemes’ acquisition of Exalead, an enterprise search engine (some described it as the French Google). Announcing the transaction to the press, Dassault CEO Bernard Charlès said, “With Exalead and its partners, we can provide a new class of search-based applications for collaborative communities.”
In the past, engineers bemoaned the lack of data. Not enough information on sustainable materials, not enough paper trail for compliance, not enough information on seismic activities at a construction site — these were the headaches from a bygone era. Today, the challenge is not a lack of data; it’s having too much data.
Information Week warned, “Security systems generate an overload of information,” therefore we’ll need “New tools [to] help manage it all more effectively” (“Data Deluge,” Aug 19, 2002). The Los Angeles Times joined the discussion with an article called “Pondering effects of the data deluge” (July 7, 2011). Similarly, The Economist pointed out that “Plucking the diamond from the waste” would be the new challenge for businesses (“Data Deluge,” Feb 25, 2010).
Newcomers: Inforbix; Alcove9
In manufacturing, newcomers like Inforbix (cofounded by Oleg Shilovitsky, the PLM blogger behind Beyond PLM) and Alcove9 are tackling the data deluge with search technologies. Whereas traditional data management and product lifecycle management software tends to cover supply chain, change orders, revisions, compliance, and collaboration, Inforbix and Alcove9 focus keenly on search and retrieval as their core offerings. Both take a similar approach: they scan, index, and remember the locations and attributes of clients’ files and documents, which lets their software respond to user queries more quickly. (Both Inforbix and Alcove9 describe their software interfaces as “Google-like.”)
Inforbix offers a wizard, a small executable applet downloadable from the company’s site. When you launch the wizard (dubbed Product Data Crawler), you’ll be prompted to identify the data repositories you’d like to index. Inforbix then scans (or crawls, as programmers like to say) your file folders and directories and sends the metadata (author, last change date, approval info, and other attributes) to a cloud-hosted server. Afterward, Inforbix users can launch Inforbix from a standard browser and perform searches to locate and use their data.
Inforbix is not a remote data storage service (like Dropbox). The information stored in Inforbix’s cloud server is strictly confined to data about your data (attributes from your CAD files and Office documents). It does not create duplicate copies of your data in the cloud (your CAD files and Office documents do not get uploaded to Inforbix’s cloud).
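For readers curious what "crawling" for metadata looks like in practice, here is a minimal Python sketch. It is not Inforbix's or Alcove9's code, and every name in it (`crawl_metadata`, `build_index`, `search`) is my own invention for illustration. It assumes a plain local folder, records only attributes the operating system exposes (name, size, last-modified date), and indexes file names rather than CAD-specific attributes such as author or approval info.

```python
import os
import re
from collections import defaultdict
from datetime import datetime, timezone


def crawl_metadata(root):
    """Walk `root` and return one metadata record per file (contents are never read)."""
    records = []
    for folder, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            info = os.stat(path)
            records.append({
                "path": path,
                "name": name,
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            })
    return records


def build_index(records):
    """Map each lowercase token in a file name to the set of paths containing it."""
    index = defaultdict(set)
    for record in records:
        for token in re.findall(r"[a-z0-9]+", record["name"].lower()):
            index[token].add(record["path"])
    return index


def search(index, query):
    """Return the paths whose file names contain every token in `query`."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return set()
    hits = [index.get(token, set()) for token in tokens]
    return set.intersection(*hits)
```

The point of the design is that the slow part (walking the folders) happens once, up front. After that, a query is a few dictionary lookups instead of a fresh scan of the disk, and only the metadata records, not the files themselves, would need to leave your machine.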
Currently, Inforbix offers two modules: xSearch and xTable. xSearch lets you use a search window to locate the items you need from your indexed data sources. It offers a series of filters to narrow your findings (by date, by author name, by last modified, and so on). xTables lets you export your search results in the form of an Excel table, with active links to your data source. (If you click on the thumbnail of a SolidWorks part, for instance, you’ll automatically launch the CAD file in the authoring software.) The value of xTables, as the company points out, is that the information is always synchronized and updated by its connection to the cloud.
For instance, if you have saved xTable search results for all AutoCAD files created by drafter John and approved by supervisor Carl, then the next time you launch the table, the results will be updated to reflect new AutoCAD files created by the same drafter and approved by the same supervisor.
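One way to picture the John-and-Carl example is as a saved query rather than a saved list: the table stores the filter criteria and re-runs them each time it is opened. The Python sketch below illustrates that idea only. It is my own assumption about the general technique, not a description of how Inforbix implements xTables, and the `FileRecord` and `SavedSearch` names and fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one indexed file."""
    name: str
    file_type: str
    author: str
    approver: Optional[str]
    created: date


@dataclass(frozen=True)
class SavedSearch:
    """Stores filter criteria, not results; a criterion left as None is ignored."""
    file_type: Optional[str] = None
    author: Optional[str] = None
    approver: Optional[str] = None

    def matches(self, record):
        wanted = (
            ("file_type", self.file_type),
            ("author", self.author),
            ("approver", self.approver),
        )
        return all(
            value is None or getattr(record, field) == value
            for field, value in wanted
        )

    def run(self, records):
        """Re-evaluate the criteria against whatever is in the index right now."""
        return [record for record in records if self.matches(record)]


if __name__ == "__main__":
    index = [
        FileRecord("bracket.dwg", "AutoCAD", "John", "Carl", date(2011, 9, 1)),
        FileRecord("housing.dwg", "AutoCAD", "John", None, date(2011, 9, 3)),
        FileRecord("budget.xlsx", "Excel", "Mary", "Carl", date(2011, 9, 5)),
    ]
    saved = SavedSearch(file_type="AutoCAD", author="John", approver="Carl")
    print([r.name for r in saved.run(index)])

    # A newly approved drawing appears the next time the saved search runs.
    index.append(FileRecord("flange.dwg", "AutoCAD", "John", "Carl", date(2011, 10, 2)))
    print([r.name for r in saved.run(index)])
```

Because only the criteria are stored, nothing has to be "refreshed" by hand; the results are simply whatever matches at the moment you look.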
Alcove9 offers its products through a series of subscription plans (free, Gold, and Platinum). The centerpiece of Alcove9’s suite is the A9 Hub, open-source software that indexes your data. To perform searches, you’ll use a standard browser. The hub is complemented by a series of AppConnect modules, add-ons that link A9 Hub to PLM, ERP, and other programs. There is also an AppConnect for CAD visualization, which functions as a viewing and markup app for those who need to visualize, approve, inspect, and annotate CAD files but don’t necessarily need to perform CAD modeling.
Since data search and retrieval place no heavy computing demands on CPUs, the function seems ideally suited to lightweight mobile devices. Inforbix is currently developing mobile apps to allow Apple and Android device users to perform similar functions from their portable devices. Alcove9’s FAQ states it’s “pursuing this capability [mobile device support] for future releases.”