I work as a researcher in the AI Foundations department of the IBM T.J. Watson Research Center, where my focus is developing new machine learning algorithms with artificial neural networks. My primary interest area is continual lifelong deep learning.

AI has recently captured public interest by achieving superhuman performance in games like Jeopardy!, chess, and Go, which were previously dominated by highly intelligent humans. However, the intelligence achieved by these AI programs remains quite different from, and less generally useful than, human intelligence. Current AI techniques, particularly with the rise of deep learning, have largely mastered the ability to create what has been called Narrow AI. Narrow AI can be extremely well tuned to a particular skill, but it demonstrates a remarkable inability to apply its knowledge quickly to other skills or to learn new skills over time. My research focuses on building a better understanding of the difficulties of, and potential solutions for, developing AI that efficiently acquires many competencies over a lifetime. If successful, this work will enable a more general form of AI that can perform high-quality continual reinforcement learning. This would allow AI to improve many more industries and lives around the globe than it can in its current form.

Contact: mdriemer at us dot ibm dot com

GitHub | Google Scholar | LinkedIn

Research Highlights

Navigating the Weight Sharing Dilemma of Lifelong Learning


Neural networks latently decide when to compress knowledge and when to orthogonalize knowledge from different experiences. However, striking this balance correctly is very difficult, and it has profound consequences for the network’s ability to perform lifelong learning over a non-stationary distribution of experiences. Our work attempts to make neural networks better at solving this problem by providing direct supervision, so that the network can learn to find the right balance of weight sharing over the course of learning. Additionally, our work aims to allow the network to self-organize in a way that eliminates the potential for interference across examples.

Recursive Routing Networks: Learning to Compose Modules for Language Understanding (NAACL 2019)
Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference (ICLR 2019)
Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning (ICLR 2018)

Learning with Temporal Abstraction


Learning with temporally abstract actions rather than primitive (low-level) actions is one of the central goals of hierarchical reinforcement learning. If learned successfully, abstract actions have the potential to ease fundamental problems encountered in continual reinforcement learning, such as long-term credit assignment and efficient environment exploration. Our work below attempts to learn a hierarchy of low-level and higher-level actions in a general-purpose way by proposing new policy gradient theorems that enable neural networks to learn action hierarchies from scratch.

On the Role of Weight Sharing During Deep Option Learning (AAAI 2020)
Learning Abstract Options (NeurIPS 2018)

Transferring Knowledge Between AI Models


Understanding how to transfer knowledge across AI models is critical to developing lifelong learning agents with more knowledge and complexity. Many theories of how the brain works, including the Complementary Learning Systems (CLS) theory of how humans learn without catastrophic forgetting, rely on the notion of effective gradual transfer between different sub-systems of the human brain over time. Additionally, collective human knowledge has undoubtedly progressed faster as a result of knowledge sharing within large societies of humans. Our work below attempts to build AI systems that begin to leverage these dynamics in order to develop more successful lifelong learning agents.

Learning Hierarchical Teaching Policies for Cooperative Agents (AAMAS 2020)
Learning to Teach in Cooperative Multiagent Reinforcement Learning (AAAI 2019 – Outstanding Student Paper Honorable Mention)
Scalable Recollections for Continual Lifelong Learning (AAAI 2019)


I have also been quite interested in the challenging problem of multi-task learning for time series prediction based on many exogenous factors.
