UPDATE 2019-05-22: The proceedings are available.
We finalized the list of accepted papers and the schedule for the 1st Interdisciplinary Workshop on Algorithm Selection and Meta-Learning in Information Retrieval (AMIR), held on 14 April 2019 in Cologne, Germany.
Out of 10 submissions, we accepted three full papers and two short papers/demos. We will also have two hands-on sessions on ASlib, auto-sklearn, and Auto-PyTorch, and we are delighted to announce Dr. Marius Lindauer (University of Freiburg) as keynote speaker (more details to follow soon).
Here is the preliminary programme, subject to minor changes. The latest programme can always be found at http://amir-workshop.org/amir-2019/for-attendees/programme-accepted-papers/.
08:00 | Registration
09:00 | Welcome | Joeran Beel (Trinity College Dublin) and Lars Kotthoff (University of Wyoming) |
09:10 | Keynote | Marius Lindauer (University of Freiburg) |
Title and Abstract: TBA
Biography: Marius's research focuses on the performance tuning of any kind of algorithm (e.g., SAT solvers or machine learning algorithms) using cutting-edge techniques from machine learning and optimization. A well-known, but tedious, time-consuming, and error-prone way to optimize performance (e.g., runtime or prediction loss) is to tune the algorithm's (hyper-)parameters manually. To lift this burden from developers and users, Marius develops methods that automate parameter tuning and algorithm selection for the problem at hand (e.g., a machine learning dataset or a set of SAT formulas). To this end, he provides ready-to-use, push-button software that enables users to optimize their software easily and efficiently.
10:10 | Algorithm selection with librec-auto | Masoud Mansoury and Robin Burke |
Due to the complexity of recommendation algorithms, experimentation on recommender systems has become a challenging task. Current recommendation algorithms, while powerful, involve large numbers of hyperparameters. Tuning hyperparameters to find the best recommendation outcome often requires executing large numbers of algorithmic experiments, particularly when multiple evaluation metrics are considered. Existing recommender-systems platforms fail to provide a basis for systematic experimentation of this type. In this paper, we describe librec-auto, a wrapper for the well-known LibRec library, which provides an environment that supports automated experimentation.
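To give a feel for the scale of the experimentation problem the abstract describes, here is a minimal sketch of a hyperparameter grid sweep in Python. It is a generic illustration only, not the librec-auto API itself; the parameter names and values are invented for the example.

```python
# Generic sketch of the combinatorial experiment load described above;
# hyperparameter names and values are illustrative, not librec-auto's.
from itertools import product

grid = {
    "factors": [10, 50, 100],
    "learning_rate": [0.001, 0.01],
    "regularization": [0.01, 0.1],
}
metrics = ["precision", "recall", "ndcg"]

# Every combination must be trained and then scored on every metric.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs), "training runs,", len(configs) * len(metrics), "metric evaluations")
# -> 12 training runs, 36 metric evaluations (before cross-validation folds)
```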
10:20 | An Extensive Checklist for Building AutoML Systems | Thiloshon Nagarajah and Guhanathan Poravi |
Automated machine learning (AutoML) is a research area that has attracted much attention in the recent past. However, the components required to build an AutoML system are neither properly documented nor very clear, owing to how recent and heterogeneous the research is. Analyzing the required steps and bringing them together in a common survey will assist continuing research. This paper presents an analysis of the components and technologies in the domains of AutoML, hyperparameter tuning, and meta-learning, and presents a checklist of steps to follow while building an AutoML system. This paper is part of ongoing research, and the findings presented will assist in developing a novel architecture for an AutoML system.
10:30 | Coffee
11:00 | Investigating Ad-Hoc Retrieval Method Selection with Features Inspired by IR Axioms | Siddhant Arora and Andrew Yates |
We consider the algorithm selection problem in the context of ad-hoc information retrieval. Given a query and a pair of retrieval methods, we propose a meta-learner that predicts how to combine the methods' relevance scores into an overall relevance score. These predictions are based on features inspired by IR axioms that quantify properties of the query and its top-ranked documents. We conduct an evaluation on TREC benchmark data and find that the meta-learner often significantly improves over the individual methods in terms of both nDCG@20 and P@30. Finally, we conduct a feature-weight analysis to investigate which features the meta-learner uses to make its decisions.
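As a rough sketch of the general idea (not the authors' exact setup), a meta-learner of this kind could map per-query features to an interpolation weight between two methods' scores. The feature values, the choice of logistic regression, and the two method names (BM25 and query likelihood) below are illustrative assumptions.

```python
# Hedged sketch: a per-query meta-learner that decides how to combine two
# retrieval methods' relevance scores. Features, model choice, and the
# two methods (BM25, query likelihood) are illustrative, not the paper's.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: per-query features (e.g. query length, a top-rank
# statistic) and a label for which method performed better on judgments.
X_train = np.array([[3, 0.42], [7, 0.11], [5, 0.30], [9, 0.05]])
y_train = np.array([0, 1, 0, 1])  # 0: BM25 better, 1: QL better

meta = LogisticRegression().fit(X_train, y_train)

def combined_score(query_features, bm25_score, ql_score):
    # Use the predicted probability that QL wins as an interpolation weight.
    w = meta.predict_proba([query_features])[0, 1]
    return (1 - w) * bm25_score + w * ql_score

print(combined_score([4, 0.35], bm25_score=12.3, ql_score=-4.1))
```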
11:30 | Augmenting the DonorsChoose.org Corpus for Meta-Learning | Gordian Edenhofer, Andrew Collins and Joeran Beel |
The DonorsChoose.org dataset of past donations provides a large and feature-rich corpus of users and items. The dataset matches donors to projects they might be interested in and hence is intrinsically about recommendations. Due to the availability of detailed item, user, and transaction features, this corpus is a suitable candidate for testing meta-learning approaches. This study aims to provide an augmented corpus for further recommender-systems studies to test and evaluate meta-learning approaches. In the augmentation, metadata from collaborative and content-based filtering techniques is added to the corpus. It is further extended with aggregated statistics of users and transactions and an exemplary meta-learning experiment. The performance in the learning subsystem is measured via the recall of recommended items in a Top-N test set. The augmented dataset and the source code are released into the public domain at GitHub:BeelGroup/Augmented-DonorsChoose.org-Dataset.
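For reference, a minimal sketch of the Recall@N measure mentioned in the abstract: the fraction of a user's held-out test items that appear among the top-N recommendations. The function and variable names are ours, not the paper's.

```python
# Minimal sketch of Recall@N: the share of held-out relevant items that
# appear in the top-N recommendations. Names are illustrative.
def recall_at_n(recommended, held_out, n=10):
    """recommended: ranked list of item ids; held_out: set of relevant ids."""
    if not held_out:
        return 0.0
    hits = len(set(recommended[:n]) & set(held_out))
    return hits / len(held_out)

# One of the two held-out items ("b") appears in the top 3 -> recall 0.5.
print(recall_at_n(["a", "b", "c", "d"], {"b", "x"}, n=3))
```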
12:00 | RARD II: The 94 Million Related-Article Recommendation Dataset | Joeran Beel, Barry Smyth and Andrew Collins |
The main contribution of this paper is to introduce and describe a new recommender-systems dataset (RARD II). It is based on data from a recommender system in the digital-library and reference-management software domain. As such, it complements datasets from other domains such as books, movies, and music. The RARD II dataset encompasses 94m recommendations, delivered in the two years from September 2016 to September 2018. The dataset covers an item space of 24m unique items. RARD II provides a range of rich recommendation data, beyond conventional ratings. For example, in addition to the usual ratings matrices, RARD II includes the original recommendation logs, which provide a unique insight into many aspects of the algorithms that generated the recommendations. The recommendation logs enable researchers to conduct various analyses about a real-world recommender system. This includes the evaluation of meta-learning approaches for predicting algorithm performance. In this paper, we summarise the key features of this dataset release, describe how it was generated, and discuss some of its unique features. Compared to its predecessor RARD, RARD II contains 64% more recommendations, 187% more features (algorithms, parameters, and statistics), 50% more clicks, 140% more documents, and one additional service partner (JabRef).
12:30 | Lunch
13:30 | Hands-on Session with ASlib | Lars Kotthoff |
ASlib is a standard format for representing algorithm selection scenarios and a benchmark library with example problems from many different application domains. I will give an overview of what it is, example analyses available on its website, and the algorithm selection competitions 2015 and 2017 that were based on it. ASlib is available at http://www.aslib.net.
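As a taste of what the hands-on session covers, here is a minimal sketch of exploring an ASlib scenario in Python. It assumes a scenario downloaded from aslib.net in the standard layout, with an algorithm_runs.arff file containing instance_id, algorithm, and runtime columns; the scenario name and the hand-rolled ARFF parsing are our own simplifications.

```python
# Hedged sketch: read an ASlib scenario's algorithm_runs.arff (ARFF data
# rows are comma-separated) and compare the single best solver with the
# oracle. Path and column names assume the standard ASlib layout.
import pandas as pd

with open("SAT11-INDU/algorithm_runs.arff") as f:
    lines = f.readlines()

cols = [l.split()[1] for l in lines if l.lower().startswith("@attribute")]
start = next(i for i, l in enumerate(lines) if l.lower().startswith("@data")) + 1
rows = [l.strip().split(",") for l in lines[start:] if l.strip()]
runs = pd.DataFrame(rows, columns=cols)
runs["runtime"] = runs["runtime"].astype(float)

# Virtual best solver (oracle): the fastest algorithm per instance.
vbs = runs.groupby("instance_id")["runtime"].min().sum()
# Single best solver: the one fixed algorithm with the lowest total runtime.
sbs = runs.groupby("algorithm")["runtime"].sum().min()
# The SBS-VBS gap (ignoring timeout penalties here) is the headroom
# that an algorithm selector can close.
print(f"SBS: {sbs:.0f}s total, VBS: {vbs:.0f}s total")
```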
14:00 | Hands-on Session with auto-sklearn and Auto-PyTorch | Marius Lindauer |
auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator. auto-sklearn frees a machine learning user from algorithm selection and hyperparameter tuning. It leverages recent advances in Bayesian optimization, meta-learning, and ensemble construction. Auto-PyTorch performs automatic architecture search and hyperparameter optimization for PyTorch.
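To preview the "drop-in replacement" idea, a minimal auto-sklearn sketch: the classifier exposes the familiar scikit-learn fit/predict interface while searching over pipelines internally. The dataset and time budget below are illustrative choices, not taken from the session.

```python
# Minimal auto-sklearn sketch: same fit/predict interface as scikit-learn,
# but algorithm selection and hyperparameter tuning happen inside fit().
import autosklearn.classification
import sklearn.datasets
import sklearn.metrics
from sklearn.model_selection import train_test_split

X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Search over preprocessors, models, and hyperparameters for 5 minutes,
# then keep an ensemble of the best pipelines found.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,
)
automl.fit(X_train, y_train)
print("accuracy:", sklearn.metrics.accuracy_score(y_test, automl.predict(X_test)))
```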
14:30 | Poster Session
A one-hour poster session in which all previous speakers present their work as a poster. The poster session gives attendees the opportunity to discuss the presenters' work in more depth. The poster session will start at 14:30 and continue throughout the coffee break.
15:00 | Coffee & Poster Session Cont'd
15:30 | Open Discussion
Details TBA
16:30 | Closing Remarks | Joeran Beel (Trinity College Dublin) and Lars Kotthoff (University of Wyoming) |
16:45 | End