"machine learning papers" via mikiobraun
Sequence optimization, where the items in a list are ordered to maximize some reward has many applications such as web advertisement placement, search, and control libraries in robotics. Previous work in sequence optimization produces a static ordering that does not take any features of the item or context of the problem into account. In this work, we propose a general approach to order the items within the sequence based on the context (e.g., perceptual information, environment description, and goals). We take a simple, efficient, reduction-based approach where the choice and order of the items is established by repeatedly learning simple classifiers or regressors for each "slot" in the sequence. Our approach leverages recent work on submodular function maximization to provide a formal regret reduction from submodular sequence optimization to simple cost-sensitive prediction. We apply our contextual sequence prediction algorithm to optimize control libraries and demonstrate results on two robotics problems: manipulator trajectory prediction and mobile robot path planning.
We describe Information Forests, an approach to classification that generalizes Random Forests by replacing the splitting criterion of non-leaf nodes from a discriminative one -- based on the entropy of the label distribution -- to a generative one -- based on maximizing the information divergence between the class-conditional distributions in the resulting partitions. The basic idea consists of deferring classification until a measure of "classification confidence" is sufficiently high, and instead breaking down the data so as to maximize this measure. In an alternative interpretation, Information Forests attempt to partition the data into subsets that are "as informative as possible" for the purpose of the task, which is to classify the data. Classification confidence, or informative content of the subsets, is quantified by the Information Divergence. Our approach relates to active learning, semi-supervised learning, mixed generative/discriminative learning.
Suppose that we observe noisy linear measurements of an unknown signal that can be modeled as the sum of two component signals, each of which arises from a nonlinear sub-manifold of a high dimensional ambient space. We introduce SPIN, a first order projected gradient method to recover the signal components. Despite the nonconvex nature of the recovery problem and the possibility of underdetermined measurements, SPIN provably recovers the signal components, provided that the signal manifolds are incoherent and that the measurement operator satisfies a certain restricted isometry property. SPIN significantly extends the scope of current recovery models and algorithms for low dimensional linear inverse problems and matches (or exceeds) the current state of the art in terms of performance.
We consider the problem of finding the graph on which an epidemic cascade spreads, given only the times when each node gets infected. While this is a problem of importance in several contexts -- offline and online social networks, e-commerce, epidemiology, vulnerabilities in infrastructure networks -- there has been very little work, analytical or empirical, on finding the graph. Clearly, it is impossible to do so from just one cascade; our interest is in learning the graph from a small number of cascades.
For the classic and popular "independent cascade" SIR epidemics, we analytically establish the number of cascades required by both the global maximum-likelihood (ML) estimator, and a natural greedy algorithm. Both results are based on a key observation: the global graph learning problem decouples into $n$ local problems -- one for each node. For a node of degree $d$, we show that its neighborhood can be reliably found once it has been infected $O(d^2 \log n)$ times (for ML on general graphs) or $O(d\log n)$ times (for greedy on trees). We also provide a corresponding information-theoretic lower bound of $\Omega(d\log n)$; thus our bounds are essentially tight. Furthermore, if we are given side-information in the form of a super-graph of the actual graph (as is often the case), then the number of cascade samples required -- in all cases -- becomes independent of the network size $n$.
Finally, we show that for a very general SIR epidemic cascade model, the Markov graph of infection times is obtained via the moralization of the network graph.
We propose a new yet natural algorithm for learning the graph structure of general discrete graphical models (a.k.a. Markov random fields) from samples. Our algorithm finds the neighborhood of a node by sequentially adding nodes that produce the largest reduction in empirical conditional entropy; it is greedy in the sense that the choice of addition is based only on the reduction achieved at that iteration. Its sequential nature gives it a lower computational complexity as compared to other existing comparison-based techniques, all of which involve exhaustive searches over every node set of a certain size. Our main result characterizes the sample complexity of this procedure, as a function of node degrees, graph size and girth in factor-graph representation. We subsequently specialize this result to the case of Ising models, where we provide a simple transparent characterization of sample complexity as a function of model and graph parameters.
For tree graphs, our algorithm is the same as the classical Chow-Liu algorithm, and in that sense can be considered the extension of the same to graphs with cycles.
We describe Information Forests, an approach to classification that generalizes Random Forests by replacing the splitting criterion of non-leaf nodes from a discriminative one -- based on the entropy of the label distribution -- to a generative one -- based on maximizing the information divergence between the class-conditional distributions in the resulting partitions. The basic idea consists of deferring classification until a measure of "classification confidence" is sufficiently high, and instead breaking down the data so as to maximize this measure. In an alternative interpretation, Information Forests attempt to partition the data into subsets that are "as informative as possible" for the purpose of the task, which is to classify the data. Classification confidence, or informative content of the subsets, is quantified by the Information Divergence. Our approach relates to active learning, semi-supervised learning, mixed generative/discriminative learning.
Inverse reinforcement learning (IRL) addresses the problem of recovering a task description given a demonstration of the optimal policy used to solve such a task. The optimal policy is usually provided by an expert or teacher, making IRL specially suitable for the problem of apprenticeship learning. The task description is encoded in the form of a reward function of a Markov decision process (MDP). Several algorithms have been proposed to find the reward function corresponding to a set of demonstrations. One of the algorithms that has provided best results in different applications is a gradient method to optimize a policy squared error criterion. On a parallel line of research, other authors have presented recently a gradient approximation of the maximum likelihood estimate of the reward signal. In general, both approaches approximate the gradient estimate and the criteria at different stages to make the algorithm tractable and efficient. In this work, we provide a detailed description of the different methods to highlight differences in terms of reward estimation, policy similarity and computational costs. We also provide experimental results to evaluate the differences in performance of the methods.