interests

A Hair, they say, divides the False and True,
Yes; and a single Alif were the clue,
Could you but find it--to the Treasure-house,
And peradventure to The Master too!

from Edward FitzGerald's translation of the Rubaiyat of Omar Khayyam
(I first saw this in a book by the mathematician John Conway, probably "On Numbers and Games".)

Interests: (massive) Learning Systems
(This was written in ~2007, with some updates since. Remains largely true, but requires some significant updating!)

Considerations of Natural Intelligence

I am interested in gaining insights into the working of our minds (a psychologist called it the "universe within") as well as building software artifacts inspired by some aspects of intelligence. I enjoy the analysis and more generally the research activity that goes with that.

For some time now, it has appeared to me (and many others) that the field of machine learning will be a key player in addressing the current challenges of artificial intelligence. Broadly speaking, learning means improving at some task with experience, while doing it. Machine learning (ML) is the study of computational methods that can account for or achieve various types of adaptation/learning.

Why should learning be an important part of the answer to AI's problems? We can't be sure, of course, until it's all settled, hopefully at some point... and there may be several widely different solutions. However, there is experience/evidence from decades of research that points to the limits of explicitly programming intelligence, as well as the potential of the learning approach. They include the limits of human (accessible) knowledge: we don't know important details of our reasoning/thinking processes (say, how we arrive at the next chunk of thought), and we don't have access to all our "knowledge" or know-how. It is very hard to anticipate our reactions and actions in advance in various contexts. And the world, and what needs to be done to behave intelligently in it in a common-sensical manner, the way typical humans do, seems very complex. If only we could take the shortcut and program intelligence! ML is perhaps the longer bottom-up path. Can we wait?!

So far, the success of machine learning has been at what appear to be the low levels of intelligence: some basic pattern recognition/analysis. That success, however, has been robust in my opinion. There is great strength/science behind it. Very importantly, I think machine learning, at least the way we conceptualize it currently, while limited, may serve as a path to discovering what new concepts and formalisms we need for gaining further insights into our higher-level faculties (e.g., intuition, creativity, imagination, and so on).

ML has made much progress, thanks to many researchers from diverse fields, and I am fortunate to have seen its great practical impact first hand (at Yahoo!, SRI, and Google). And it remains very exciting and alive. For me, the question is always: what do we do next? In this case, where do we take machine learning from here, keeping in mind, whenever possible, the primary goal of insights into the workings of our minds, and more generally into what intelligence can be?

One aspect that has caught my attention is the scale that the sophisticated intelligence of higher animals seems to require: the number of concepts in our minds is huge (probably millions and beyond, depending on how we define a concept). Think for example about visual concepts (such as physical things: trees, buildings, various tools, stationery, people, ...), or verbal concepts. And yet, the brain does its normal operations (classification and other inferences) relatively rapidly, considering the brain's limited hardware (limited speed and memory). I think this remains unaccounted for by our current know-how. How does our brain acquire/organize/manage such a large "concept base"?! It remains largely a mystery.

Here are candidate subproblems that I have looked at: how do we categorize so efficiently, and moreover, how do we efficiently learn to efficiently categorize in the face of so many classes? Can we design algorithms/systems that behave similarly? I don't think we, as say grown-ups, were always like this (adequately proficient). Depending on the task, it has taken us days, months, or years to reach our current capabilities, via learning.

Another big question for me is: how do we acquire so many concepts in the first place? Again, I assume we learn most of them, and that such tasks require much learning and experience.

I have therefore been thinking about very large scale (data intensive) ongoing learning regimes and systems. We need scalable algorithms, and I expect they will be relatively simple and light. That doesn't mean they are easy to find! We need to think in terms of "systems" that learn. Of course, there are many challenges and questions: which problems or tasks should we pursue, and how do we identify good tasks/problems and formalize them well (this is a BIG meta problem!)? Why do we need a "systems" approach? Is large scale really necessary? How do we keep system complexity in check, sustain efficiency, and ensure that our systems don't drown in the sea of uncertainty over their long periods of operation? And so on. I remain optimistic! My publications below delve further into these questions.

And of course, the mind does much more. This should keep us in awe and wonder and busy for some time! Once we have solved the problems, we can kick back, relax, and let our intelligent progeny take over .. :)

Practical Considerations

Many (learning) problems involve a number of the aspects below. Stated another way, if we can design algorithms and systems that address these challenges, I believe we increase the utility and intelligence of our systems. For example, in the web space, we can better respond to the users of a search engine, that is, better personalize and adapt the system's response to individual users and specific situations.

Large attribute dimensionality: the situation (page, user, query, scene) has a large number of attributes that can be relevant to classification/prediction (thousands, millions, ...).
Large class dimensionality: the situation at hand should be classified into a high-dimensional space: a myriad of possible categories/classes (thousands, millions, ...).
Attribute noise: attribute values are noisy, for example when they are the output of sensors or other classifiers that are imperfect.
Class noise: class values can be noisy/uncertain/unreliable. For example, clicks (in a search setting) may imply some relevance, but this is certainly not 100% guaranteed. In general, feedback from the environment is noisy.
Huge data (more generally experience), or number of instances: the number of instances can make up for the noise, and in many scenarios we have lots of instances (this depends on looking at the problem the right way).

Solutions to these types of problems need a number of properties, including a good subset of the following, depending on the task:

Space efficiency (need to process very large possibly never ending data)
Time efficiency (need to process very large possibly never ending data!)
Noise tolerance
Sample efficiency (don't waste instances/time: learn at an appropriate rate).
Other: handle drift (what you learn changes over time), biases in data, ...

The work below proposes a particular task (or activity) and explores algorithms that address some of the challenges. This is work in progress.

My current publications in this track (may be outdated; please also see the publications page):

Prediction games in infinitely rich worlds (a description of and motivation for a learning task/activity that addresses the question of acquiring myriad categories, and a discussion of what it may take to solve it):

(Long) Technical Report on the basic idea/philosophy and various considerations/issues/challenges, 2007.
Shorter abstracts or position papers:
Position paper at AAAI FSS07 (pdf),
Abstract at the Learning (Snowbird) workshop 2007 (pdf), and presentation (ppt),
Earliest version at Utility Based Data Mining workshop at KDD'06.

System implementations:

Exploring Massive Learning via a Prediction System, and presentation (ppt), at the AAAI Fall Symposium Series on Computational Approaches to Representation Change During Learning and Development, 2007. This paper describes a first prediction system plus some experiments.

Efficient algorithms and systems for what I call many-class learning (thousands of classes and beyond). The above prediction system ideas are based on the discovery that efficient accurate many-class learning is possible (see below), in particular via efficient "index learning". A first problem was efficient classification (or prediction) in the presence of thousands of classes. The current approach is for each feature to connect to a relatively small set of categories/concepts. The connections form a sparse weighted bipartite graph, mapping features to concepts, or an index (concepts are "indexed" by features, in analogy to documents being indexed by their terms, say in a search engine):

Efficient Online Learning and Prediction of Users' Desktop Behavior. In IJCAI, 2009. With Hung Bui and Eric Yeh.
On Updates that Constrain the Connections of Features During Learning. In ACM KDD, 2008. With Jian Huang.
Error-Driven Generalist+Experts (EDGE): A Multi-stage Ensemble Framework for Text Categorization. In ACM CIKM, 2008. With Jian Huang and C. Lee Giles.
Large-Scale Many-Class Learning. In SIAM Data Mining, 2008. With Michael Connor.

Learning indices that rank (technical report version), abstract (pdf) and presentation at the Learning workshop (ppt).
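
The index described above (a sparse weighted bipartite graph from features to concepts) can be illustrated with a small sketch. This is not the update rule from the papers listed here; it is a toy, mistake-driven stand-in, and the class name `SparseIndex`, the edge cap `max_edges_per_feature`, and the step size `learning_rate` are all hypothetical choices made only for illustration.

```python
from collections import defaultdict


class SparseIndex:
    """Toy feature-to-concept index: a sparse weighted bipartite graph."""

    def __init__(self, max_edges_per_feature=3, learning_rate=0.5):
        self.max_edges = max_edges_per_feature  # hypothetical sparsity cap
        self.rate = learning_rate               # hypothetical step size
        self.edges = defaultdict(dict)          # feature -> {concept: weight}

    def rank(self, features):
        """Score only the concepts reachable from the active features."""
        scores = defaultdict(float)
        for feature, value in features.items():
            for concept, weight in self.edges.get(feature, {}).items():
                scores[concept] += value * weight
        # Highest score first; ties broken by concept name for determinism.
        return sorted(scores, key=lambda c: (-scores[c], c))

    def update(self, features, true_concept):
        """Mistake-driven update (an assumption, not the published rule)."""
        ranking = self.rank(features)
        if ranking and ranking[0] == true_concept:
            return  # already ranked first; leave the index unchanged
        for feature, value in features.items():
            out = self.edges[feature]
            out[true_concept] = out.get(true_concept, 0.0) + self.rate * value
            if len(out) > self.max_edges:
                # Keep each feature sparse by dropping its weakest edge.
                weakest = min(out, key=lambda c: (out[c], c))
                del out[weakest]


index = SparseIndex()
index.update({"fur": 1.0, "barks": 1.0}, "dog")
index.update({"fur": 1.0, "meows": 1.0}, "cat")
print(index.rank({"fur": 1.0, "barks": 1.0}))  # ['dog', 'cat']
```

The point of the sketch is the cost profile: ranking touches only the few concepts connected to the active features, so the work per instance does not grow with the total number of concepts.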

The initial idea was to learn an unweighted index to drastically reduce the number of candidate concepts at classification time, after which classifiers would be applied to the remaining candidates (i.e., the index points to binary classifiers, giving a two-stage solution):

Recall Systems at AISTATS'07, (ppt presentation). With Wiley Greiner, David Kempe and Mohammad Salavatipour.

Learning when Concepts Abound. With Wiley Greiner. Motivates the problem of learning many concepts, and introduces the idea of learning an index. Y! Tech report, 2006.
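
The two-stage idea described above (an unweighted index to cut down the candidate concepts, then binary classifiers applied only to those candidates) might look roughly like the following. How the index is actually learned is not described on this page, so building it from simple feature/concept co-occurrence is an assumption, and the function names and the hand-written classifiers are hypothetical.

```python
from collections import defaultdict


def build_unweighted_index(examples):
    """Map each feature to the set of concepts it has co-occurred with."""
    index = defaultdict(set)
    for features, concept in examples:
        for feature in features:
            index[feature].add(concept)
    return index


def recall_candidates(index, features):
    """Stage 1: union of the concepts indexed by any active feature."""
    candidates = set()
    for feature in features:
        candidates |= index.get(feature, set())
    return candidates


def classify(index, classifiers, features):
    """Stage 2: run binary classifiers only on the recalled candidates."""
    candidates = recall_candidates(index, features)
    return sorted(c for c in candidates if classifiers[c](features))


# Hypothetical training examples and stand-in binary classifiers.
examples = [
    ({"fur", "barks"}, "dog"),
    ({"fur", "meows"}, "cat"),
    ({"wings", "sings"}, "bird"),
]
classifiers = {
    "dog": lambda f: "barks" in f,
    "cat": lambda f: "meows" in f,
    "bird": lambda f: "wings" in f,
}

index = build_unweighted_index(examples)
print(sorted(recall_candidates(index, {"fur", "barks"})))  # ['cat', 'dog']
print(classify(index, classifiers, {"fur", "barks"}))      # ['dog']
```

Here the "bird" classifier is never evaluated for a furry, barking instance; with thousands of concepts, that pruning is where the savings would come from.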
