Case Studies
The following is a sample of some of the technologies and innovative solutions Intelligent Systems has provided for its clients.
Cole-Parmer, a laboratory and scientific equipment provider, uses our Recommendation
Engine service to provide on-demand product recommendations on its website's online
catalog. The recommendations are delivered through a Software as a Service
(SaaS) model and are generated by a collaborative filtering algorithm, which
produces additional product recommendations for each product page in the
catalog by analyzing the purchase history of previous customers who bought
that product and computing correlations between products. Recommendations are
computed on our cloud-based servers from order history uploaded daily and
delivered to the client website via a JSON Web Service API. The service is
licensed through a revenue-sharing model which, along with the SaaS model,
reduces financial risk for the client and eliminates up-front development
costs and complexity.
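The core of such a collaborative filtering engine can be sketched in a few lines. The example below is illustrative only, not the production service; it computes item-item correlations from order history using Jaccard similarity over the sets of orders that contain each product, and all function names are hypothetical:

```python
from collections import defaultdict
from itertools import combinations

def item_correlations(orders):
    """Compute co-occurrence-based item-item scores from order history.

    orders: list of sets of product IDs, one set per customer order.
    Returns {(a, b): score}, where score is the Jaccard similarity of
    the sets of orders containing each product.
    """
    item_orders = defaultdict(set)          # product -> set of order indices
    for i, order in enumerate(orders):
        for product in order:
            item_orders[product].add(i)
    scores = {}
    for a, b in combinations(sorted(item_orders), 2):
        inter = len(item_orders[a] & item_orders[b])
        if inter:
            union = len(item_orders[a] | item_orders[b])
            scores[(a, b)] = scores[(b, a)] = inter / union
    return scores

def recommend(product, scores, top_n=3):
    """Rank other products by their correlation with the given product."""
    related = [(b, s) for (a, b), s in scores.items() if a == product]
    return [b for b, _ in sorted(related, key=lambda x: -x[1])[:top_n]]
```

In the SaaS setup described above, scores of this kind would be recomputed from each day's uploaded order history and served to the product pages through the JSON API.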
Intelligent Systems developed an RDF repository to manage metadata, content,
and the migration process for a content migration and new CMS deployment for
Bio-Rad, a laboratory and scientific equipment provider. The repository was
built upon a proprietary Intelligent Systems RDF repository tool called
Metador, which was customized for this migration project. Metador
itself is configured and its behavior is controlled via RDF stored in the repository,
making it well suited for customization to the needs of a particular project such as
this one. In addition to storing and authoring metadata for the migration and
eventual CMS, Metador was used to manage and execute migration tasks using
a plugin task architecture, also implemented using RDF. Finally, Metador
was customized to play the role of an interim CMS for several months to
fill a gap resulting from unexpected delays in deployment of the target CMS (Interwoven).
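Metador is proprietary, but the central idea, a repository that stores both content metadata and the tool's own configuration as triples, can be illustrated with a minimal in-memory triple store. The class, predicates, and values below are hypothetical, not Metador's actual model:

```python
class TripleStore:
    """Minimal in-memory RDF-style triple store (illustrative only)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, s=None, p=None, o=None):
        """Return triples matching a pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

store = TripleStore()
# Content metadata and tool configuration live in the same repository,
# so behavior can be customized per project simply by editing triples.
store.add("doc:123", "meta:status", "migrated")
store.add("task:extract", "cfg:handler", "plugin.ExtractTask")
```

Because a migration task's handler is itself just a triple, adding a new plugin task is a data change rather than a code change, which is what made per-project customization cheap.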
Developed semantic crowdsourcing architecture for scalable knowledge acquisition
in order to efficiently collect large amounts of new metadata during the migration.
This architecture was implemented on top of the Metador RDF repository
architecture described above. This approach built upon many of the ideas
from our earlier crowdsourcing work at TicketsNow,
but made much more extensive use of RDF to represent the acquired metadata,
questions, answers, and other crowdsourcing meta-knowledge, and the RDF repository
to store and manage this knowledge. It also incorporated the crowdsourcing UI
into the tool itself rather than using Mechanical Turk and leveraged the
Bio-Rad content authors and business experts to perform the knowledge
acquisition rather than external users such as the Mechanical Turk community.
In both cases, a key driver is making the difficult task of knowledge
acquisition more efficient and scalable.
Eventually, many of the insights and technology of each of these crowdsourcing efforts were incorporated into a new tool called Crowd Sorcerer, which is explicitly designed to combine the power of a semantic repository and crowdsourcing to achieve highly scalable knowledge acquisition, whether it is performed internally using the Crowd Sorcerer's crowdsourcing UI or leveraging external users and customers using Mechanical Turk integration or a customer facing website. This is described in more detail in the Crowdsourcing technology section.
Developed adaptive search engine for laboratory equipment provider. This search engine
analyzes search history to determine the actual search terms that users have used
in previous searches and which product they selected from the resulting search results,
in order to learn the relationship between search terms and target products.
In this way, we were able to learn how customers think about products
in their own words and return the exact items that they are looking for when they
use such queries. Furthermore, such an adaptive approach improves as it is used
and can learn new associations as new products are added.
This approach achieved significantly higher accuracy in a head-to-head
comparison with a well-known commercial "conceptual" search engine previously
used on the website. Even better results were achieved using a hybrid
approach which combined the adaptive technique with the traditional search
engine, applying search-history optimization when that data was available and
improved the results. This approach built upon the ideas from our earlier
adaptive search work
at TicketsNow, but in this case
we were able to do a more quantitative evaluation and comparison against
existing commercial search engines. We were also able to test this approach
on broader, more general purpose, internet search by applying the adaptive
search technique to a sample of internet search history made available at
the time by AOL. This approach to search is described in more detail in the
Intelligent Search technology section.
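At its core, the adaptive technique is counting: which products do users actually click after searching with which terms. A simplified sketch (hypothetical function names, not the deployed engine):

```python
from collections import defaultdict

def learn_associations(click_log):
    """Learn term -> product associations from search history.

    click_log: iterable of (query_string, clicked_product_id) pairs,
    one pair per search in which the user selected a result.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for query, product in click_log:
        for term in query.lower().split():
            counts[term][product] += 1
    return counts

def adaptive_search(query, counts, top_n=5):
    """Score products by summing per-term click counts for the query."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for product, n in counts[term].items():
            scores[product] += n
    return [p for p, _ in sorted(scores.items(), key=lambda x: -x[1])[:top_n]]
```

Because the model is rebuilt from the log, it improves with use and picks up new products automatically, which is the property highlighted above; a production hybrid would fall back to a conventional engine for queries with no history.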
TicketsNow (now part of Ticketmaster)
Developed an iterative semantic crowdsourcing knowledge acquisition strategy to build a taxonomy
of music, theater, and sporting events for the events sold by an online ticket service, and to classify
the individual events into the appropriate taxonomy category. Existing ticket inventory data was
used to generate questions about categories of music, theater, and sporting events, and classify
individual events into these categories using the AWS Mechanical Turk API. This API allows developers
and knowledge engineers to ask specific questions of users in return for micro-payments
and retrieve the answers.
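A central mechanism in this kind of workflow is aggregating the answers retrieved for a question that was posted to several workers: accepting only answers with sufficient agreement filters out noise and low-effort responses. A minimal majority-vote sketch (hypothetical names, not the production code):

```python
from collections import Counter

def consensus(answers, min_agreement=0.6):
    """Aggregate multiple workers' answers to the same question.

    Returns the majority answer if enough workers agree, else None
    (meaning the question should be re-asked or escalated).
    """
    if not answers:
        return None
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes / len(answers) >= min_agreement else None
```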
In our strategy, several different questions and question formats were asked in an iterative fashion to incrementally build and refine the knowledge base. Questions ranged from broad (name five categories of music) to specific (classify an individual performer or group into a category). In the earlier stages, questions were open ended, allowing free-text answers; in later stages they were multiple choice, allowing the user to select from a set of existing values or categories. The process was iterative: answers from the early stages were used to construct more targeted questions in the later stages, incrementally building up the vocabulary and taxonomy. The same question was asked of multiple users, allowing statistical techniques to compute averages, filter results for quality, and collaboratively achieve consensus.

In the end, this process created a taxonomy of thousands of items. In previous projects, an undertaking of this sort often took weeks or months at significant cost. This one was completed in a weekend for less than $100 in micro-payments, arguably with superior results, since the resulting taxonomy and individual classifications reflected the consensus view of the mental models of a large set of individuals representative of actual customers.

TicketsNow (Division of Ticketmaster)
Developed adaptive search engine for online ticket service. This search engine
analyzes search history to determine the actual search terms that users have used
in previous searches and which item they selected from the resulting search results,
in order to learn the relationship between search terms and target products.
In this way, we were able to learn how customers think about products
in their own words and return the exact items that they are looking for when they
use such queries. Furthermore, such an adaptive approach improves as it is used
and can learn new associations as new products are added. This approach was
subsequently used in other ecommerce projects such as Bio-Rad
where it compared favorably to commercial search engines.
TicketsNow (Division of Ticketmaster)
Developed recommendation engines for recommending tickets using multiple
recommendation strategies. Developed collaborative filtering engine which
recommends products (tickets in this case) based upon analyzing the
purchase history of previous customers who purchased this product and computing
correlations between products. Developed rule-based recommendation engine
which recommends products based upon rules capturing business knowledge.
Developed rule-based inference engine and RDF knowledge representation
used by rules. Developed statistical model which combined several
sources of evidence for recommendations, including revenue management
considerations, which incorporate business goals such as optimizing
revenue, profit, and managing inventory in addition to user preferences
in recommending products.
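The rule-based strategy can be illustrated with a minimal sketch; the rule format, conditions, and product names below are hypothetical, not the actual business rules:

```python
# Each rule pairs a condition on the current context (the event being
# viewed, inventory levels, time to the event) with products to recommend.
RULES = [
    # Hypothetical business rule: concerts close to the event date
    # pair well with parking passes.
    (lambda ctx: ctx["category"] == "concert" and ctx["days_out"] < 7,
     ["parking-pass"]),
    # Hypothetical revenue-management rule: promote overstocked inventory.
    (lambda ctx: ctx["inventory"] > 500,
     ["overstocked-event"]),
]

def rule_recommend(ctx):
    """Fire every rule whose condition matches, preserving rule order."""
    recs = []
    for condition, products in RULES:
        if condition(ctx):
            recs.extend(p for p in products if p not in recs)
    return recs
```

In the actual engine the rules were expressed against an RDF knowledge representation rather than Python lambdas, and rule-based results were one evidence source combined with collaborative filtering scores in the statistical model.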
Conducted Business Intelligence and Web Analytics initiative for Orbitz online travel
business. Performed data mining studies on high volume click-stream data using
statistical and machine learning tools. Developed executive dashboards to
give executives timely and continual visibility into business through visual
monitoring of key metrics. Developed interactive Web Analytics tool that
allowed business analysts to explore the paths that users take through their website
and visualize the statistical characteristics of these paths and the decision points
which are most likely to lead (or not lead) to goals such as conversions.
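The decision-point analysis can be conveyed with a simplified sketch that measures, for each page, what fraction of the sessions passing through it go on to reach a goal such as conversion (page names and function names are hypothetical):

```python
from collections import defaultdict

def decision_point_stats(sessions, goal="purchase"):
    """For each page, count sessions through it and the share that convert.

    sessions: list of page-visit sequences, e.g. ["home", "search", "purchase"].
    Returns {page: (sessions_through, conversion_rate)}. Each page is
    counted once per session, however many times it was revisited.
    """
    through = defaultdict(int)
    converted = defaultdict(int)
    for session in sessions:
        reached = goal in session
        for page in set(session):
            through[page] += 1
            if reached:
                converted[page] += 1
    return {p: (through[p], converted[p] / through[p]) for p in through}
```

Pages with high traffic but low conversion rates are exactly the decision points the interactive tool highlighted for analysts to investigate.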
Developed RDF model and semantic search engine for Java API documentation (Javadoc)
for Sun Microsystems. Developed parsers and related algorithms to parse Javadoc
HTML documents into RDF model of Java classes and methods and XML content model
(for descriptive text) and developed search engine to search RDF model for
items which match both text and complex relationships (e.g. find methods
which accept a String argument and return an object whose type is a subtype of class X).
The search engine also allowed browsing the API via the RDF structure and relationships
and exposed the RDF structure and search functions via a Web Service API.
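The kind of structural query described above can be illustrated with a toy triple model and a transitive subtype check. The triples below are illustrative stand-ins, not accurate Javadoc facts, and the predicate names are invented:

```python
# RDF-style facts about classes and methods (hypothetical, simplified).
TRIPLES = {
    ("List", "subClassOf", "Collection"),
    ("ArrayList", "subClassOf", "List"),
    ("Arrays.asList", "hasArgType", "String"),
    ("Arrays.asList", "returns", "List"),
    ("String.valueOf", "hasArgType", "Object"),
    ("String.valueOf", "returns", "String"),
}

def is_subtype(cls, ancestor):
    """Transitive closure over subClassOf edges (a class is its own subtype)."""
    if cls == ancestor:
        return True
    parents = [o for s, p, o in TRIPLES if s == cls and p == "subClassOf"]
    return any(is_subtype(parent, ancestor) for parent in parents)

def find_methods(arg_type, return_ancestor):
    """Query: methods taking arg_type and returning a subtype of return_ancestor."""
    return sorted(
        s for s, p, o in TRIPLES
        if p == "returns" and is_subtype(o, return_ancestor)
        and (s, "hasArgType", arg_type) in TRIPLES
    )
```

This is the query a plain-text Javadoc search cannot express: the answer depends on following the class hierarchy, not on matching words.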
Developed RDF repository tool to manage metadata and migrated content in content migration project.
Developed a variety of content extraction and content transformation algorithms. Developed
a probabilistic tool to reverse engineer sitemap from link structure of legacy site.
Developed content migration tools for content migration project for large bank.
Developed a tool called the Clipper to extract HTML page components from legacy site and
use extracted content to train learning algorithm to generate extraction rules
for automated extraction.
Applied a number of AI and Machine Learning techniques to help Ameritech migrate a large set
of legacy mainframe M&P and other raw text data to a new content management system supporting
a new call center help application. These included applying conceptual clustering to
automatically create a taxonomy, automatic document classification, automatic creation of
keywords and other metadata (e.g. synonyms, abbreviations, related phrases) using
statistical analysis of text, and pattern recognition to extract complex document structures
(e.g. tables) from raw text data. Also created a search engine to allow call center
agents to search content using fuzzy and imprecise terms.
Created text mining tool to allow Ameritech product managers to discover insights about customer
experience and product issues hidden in the vast volumes of verbatim comment data collected daily
in their call centers.
This data contained valuable business intelligence which could not be previously exploited because
of the vast amount and unstructured nature of the data. The tool analyzed the statistical
properties of the text to surface significant issues and trends buried in call center
call logs. Multiple patents were awarded for the underlying technology.
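The surfacing step can be approximated with a log-likelihood comparison of a term's recent frequency against its long-run baseline, a standard technique shown here as a simplified sketch rather than the patented method (names are hypothetical):

```python
import math
from collections import Counter

def surfacing_score(term, recent_counts, baseline_counts):
    """Log-likelihood-style score for how unusually a term's frequency
    deviates in recent call logs versus the long-run baseline.

    recent_counts, baseline_counts: Counter of term -> occurrences.
    A score near zero means the term is behaving as usual; large scores
    flag terms worth surfacing to an analyst.
    """
    a = recent_counts[term]
    b = baseline_counts[term]
    n1 = sum(recent_counts.values())
    n2 = sum(baseline_counts.values())
    e1 = n1 * (a + b) / (n1 + n2)   # expected counts if there were no trend
    e2 = n2 * (a + b) / (n1 + n2)
    score = 0.0
    if a:
        score += 2 * a * math.log(a / e1)
    if b:
        score += 2 * b * math.log(b / e2)
    return score
```

Ranking terms by such a score is what lets significant issues "surface" out of thousands of daily verbatim comments without anyone reading them all.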
Invented Predictive Text Entry algorithm which used a predictive model based on letter n-gram statistics
to predict the next letter from previously entered text.
This allowed efficient text entry in the early days of cell phones, prior to touch screens and
on-screen (or off-screen) keyboards, when text messages needed to be entered via a slow laborious
process on numeric telephone keypads. This technique also foreshadowed the auto-complete
keyboards on current mobile devices. Multiple patents were awarded for this algorithm.
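The letter n-gram idea can be sketched directly; this is a simplified illustration, not the patented algorithm:

```python
from collections import defaultdict

def train_ngrams(corpus, n=3):
    """Count letter n-grams: a context of n-1 letters -> next-letter counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for text in corpus:
        text = text.lower()
        for i in range(len(text) - n + 1):
            context, nxt = text[i:i + n - 1], text[i + n - 1]
            counts[context][nxt] += 1
    return counts

def predict_next(entered, counts, n=3):
    """Rank candidate next letters given the text entered so far."""
    context = entered.lower()[-(n - 1):]
    ranked = sorted(counts[context].items(), key=lambda kv: -kv[1])
    return [letter for letter, _ in ranked]
```

On a numeric keypad, offering the statistically likely letters first cuts the number of key presses per character, which was the whole point before full keyboards existed.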
Applied Machine Learning techniques to automate migration of content from a large number of
dissimilar and mostly static member websites to a single common content management system.
This included applying statistical pattern recognition algorithms to extract content components
such as navigation bars, breadcrumbs, and body text from pages with widely varying formats.
Because there was no consistent format, simple techniques such as XPath and regular expressions
were not viable. Instead Intelligent Systems developed sophisticated pattern recognition
algorithms which analyzed and matched the statistical patterns within the HTML content.
These algorithms were trained on a small set of manually extracted content and then used
to automatically extract content from the remaining pages. This approach resulted in
automatic processing of thousands of pages with an accuracy in excess of 95%.
Developed content targeting and dynamic content platform for Digital Signage.
The tool generated
targeted content for digital signs in stores and public spaces. Content was
targeted using collaborative filtering, statistical models, and rule-based
targeting. Developed RDF model for representing content metadata and a rule engine
for rule-based targeting.
Created rule-based content migration tool to automate content migration and support
deployment of new AEM/CQ CMS for Coca-Cola. Rules transform nested objects and object
relationships in legacy CMS to new object model used in target CMS, using JSON
as intermediate language. Created new language called JPath to represent
patterns in nested data elements and traverse references between data elements
in the JSON representation of the source and target object models. Created
Crosslinker tool to automatically insert links between content pages based
upon rules.
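JPath itself is proprietary, but a toy evaluator for a JPath-like dotted path with wildcards conveys the idea of matching patterns in nested JSON data; the syntax supported here is a small invented subset, and the real language is richer (patterns, reference traversal between objects):

```python
def jpath(data, path):
    """Evaluate a tiny JPath-like expression against nested JSON data.

    Supports dotted keys and '*' as a wildcard over list elements or
    dict values; returns a list of all matches. Illustrative only.
    """
    nodes = [data]
    for step in path.split("."):
        nxt = []
        for node in nodes:
            if step == "*":
                values = (node if isinstance(node, list)
                          else list(node.values()) if isinstance(node, dict)
                          else [])
                nxt.extend(values)
            elif isinstance(node, dict) and step in node:
                nxt.append(node[step])
        nodes = nxt
    return nodes

# Example: pull every section title out of a (hypothetical) legacy page object.
page = {"page": {"sections": [{"title": "A"}, {"title": "B"}]}}
titles = jpath(page, "page.sections.*.title")
```

In the migration, expressions like this selected elements from the JSON rendering of the legacy object model so that rules could map them onto the target AEM object model.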
Developed back end server software for GM Mobile website using Adobe Experience Manager
(AEM, formerly CQ) Content
Management System. Developed product taxonomy, 3rd party data feed API integrations,
and Build Your Own (BYO) product configurator. Investigated ways to make the team more
effective by applying knowledge management techniques,
including mining project and technical knowledge from project instant message
history using Skype API.
Copyright ©1997-2004 Intelligent Systems. All rights reserved.