Using deep learning to answer Aristo's science questions
This repository contains code for training deep learning systems to do question answering tasks. Our primary focus is on Aristo’s science questions, though we can run various models on several popular datasets.
This library is built on top of Keras (for actually training and executing deep models) and is also designed to be used with an experiment framework written in Scala, which you can find here: Deep QA Experiments.
Running experiments with python
Although we recommend using the Deep QA Experiments library to run reproducible experiments, this library is designed as standalone software that runs deep NLP models from JSON specification files. To do so, run the following command from the base directory: python src/main/python/run_solver.py [model_config]. You must use Python >= 3.5, because we make heavy use of the type annotations introduced in Python 3.5 to aid code readability (we recommend using Anaconda to set up Python 3 if you don’t have it set up already).
You can see some examples of what model configuration files look like in the example experiments directory. We try to keep these up to date, but the way parameters are specified is still sometimes in a state of flux, so we make no promises that these are actually usable with the current master. Looking at the most recently added or changed example experiment should be your best bet to get an accurate format. If you find one that’s out of date, submitting a pull request to fix it would be great.
Finally, the way parameters are parsed in DeepQA can be a little confusing. When you provide a JSON specification, various classes pop values from the resulting dictionary (they really are popped, so they are no longer in the parameter dict afterwards). This is helpful for two reasons. First, it lets you check that every parameter you pass is used at some point, which prevents hard-to-find bugs. Second, it keeps functionality cleanly separated, because there are no globally defined variables of the kind that other argument parsing methods often rely on.
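The pop-and-check pattern described above can be sketched as follows. This is a minimal illustration and not DeepQA's actual code: the classes Encoder and Trainer, the helper assert_params_empty, and the parameter keys are all hypothetical names chosen for the example.

```python
import json
from typing import Any, Dict


def assert_params_empty(params: Dict[str, Any], owner: str) -> None:
    """Hypothetical helper: fail loudly if any parameter was never consumed."""
    if params:
        raise ValueError("Unused parameters for {}: {}".format(owner, sorted(params)))


class Encoder:
    """Hypothetical component that consumes only its own parameters."""

    def __init__(self, params: Dict[str, Any]) -> None:
        # pop() removes each key, so consumed values disappear from the dict.
        self.units = params.pop("units", 100)
        self.dropout = params.pop("dropout", 0.0)
        assert_params_empty(params, "Encoder")


class Trainer:
    """Hypothetical top-level class that hands a sub-dict to the Encoder."""

    def __init__(self, params: Dict[str, Any]) -> None:
        self.num_epochs = params.pop("num_epochs", 20)
        self.encoder = Encoder(params.pop("encoder", {}))
        # Anything still left here is a typo or an unsupported option.
        assert_params_empty(params, "Trainer")


def build(spec_json: str) -> Trainer:
    """Parse a JSON specification string and build the (hypothetical) trainer."""
    return Trainer(json.loads(spec_json))


trainer = build('{"num_epochs": 5, "encoder": {"units": 50}}')

try:
    build('{"num_epoch": 5}')  # misspelled key is never popped
except ValueError as error:
    print(error)
```

Because each class pops only the keys it understands, a misspelled key such as num_epoch is left behind and reported, rather than being silently ignored in favour of the default.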
The deep_qa library is organised into the following main sections:
- Common: Code for parameter parsing, logging and runtime checks.
- Contrib: Related code for experiments, plus experimental layers, models and features. Generally untested.
- Data: Indexing, padding, tokenisation, stemming, embedding and general dataset manipulation happens here.
- Layers: The bulk of the library. Use these Layers to compose new models. Some of these Layers are very similar to what you might find in Keras, but altered slightly to support arbitrary dimensions or correct masking.
- Models: Frameworks for different types of task. These generally all extend the TextTrainer class which provides training capabilities to a DeepQaModel. We have models for Sequence Tagging, Entailment, Multiple Choice QA, Reading Comprehension and more. Take a look at the README for more details.
- Tensors: Convenience functions for writing the internals of Layers. Will almost exclusively be used inside Layer implementations.
- Training: This module does the heavy lifting for training and optimisation. We also wrap the Keras Model class to give it some useful debugging functionality.
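To give a sense of what "correct masking" means for the Layers mentioned above, here is a minimal pure-Python sketch of a softmax that ignores padded positions. It is not taken from deep_qa and does not use Keras; the function name masked_softmax and its signature are assumptions made purely for illustration.

```python
import math
from typing import List


def masked_softmax(scores: List[float], mask: List[int]) -> List[float]:
    """Softmax over `scores`, giving zero probability where `mask` is 0."""
    kept = [s for s, m in zip(scores, mask) if m]
    if not kept:
        # Every position is padding: return all zeros instead of dividing by zero.
        return [0.0] * len(scores)
    # Subtract the largest unmasked score for numerical stability.
    max_score = max(kept)
    exps = [math.exp(s - max_score) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# The third position is padding, so it receives no probability mass.
probabilities = masked_softmax([1.0, 2.0, 3.0], [1, 1, 0])
```

A plain softmax would assign the padded position the largest probability here; the masked version renormalises over the real tokens only.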
We’ve also tried to give reasonable documentation throughout the code, both in docstrings and in READMEs distributed throughout the code packages, so browsing GitHub should be pretty informative if you’re confused about something. If you’re still confused about how something works, open an issue asking us to improve the documentation of a particular piece of the code (or, if you’ve figured it out after searching a bit, submit a pull request containing the documentation improvements that would have helped you).
This repository implements several variants of memory networks, including the models found in these papers:
- The original MemNN, from Memory Networks, by Weston, Chopra and Bordes
- End-to-end memory networks, by Sukhbaatar and others (close, but still in progress)
- Dynamic memory networks, by Kumar and others
- DMN+, from Dynamic Memory Networks for Visual and Textual Question Answering, by Xiong, Merity and Socher
- The attentive reader, from Teaching Machines to Read and Comprehend, by Hermann and others
- Gated Attention Reader, from Gated Attention Readers for Text Comprehension
- Bidirectional Attention Flow, from Bidirectional Attention Flow for Machine Comprehension
- Decomposable Attention, from A Decomposable Attention Model for Natural Language Inference
- Windowed-memory MemNNs, from The Goldilocks Principle: Reading Children’s Books with Explicit Memory Representations (in progress)
We also include some of our own, as-yet-unpublished variants. There is a lot of similarity between the models in these papers, and our code is structured to make it easy to switch between them. As an example of this modular approach, here is a description of how we’ve built an extensible memory network architecture in this library: this readme.
Datasets
This code allows for easy experimentation with the following datasets:
- AI2 Elementary school science questions (no diagrams)
- The Facebook Children’s Book Test dataset
- The Facebook bAbI dataset
- The NewsQA dataset
- The Stanford Question Answering Dataset (SQuAD)
- The Who Did What dataset
If you use this code and think something could be improved, pull requests are very welcome. Opening an issue is OK, too, but we’re a lot more likely to respond to a PR. The primary maintainer of this code is Matt Gardner, with a lot of help from Pradeep Dasigi (who was the initial author of this codebase), Mark Neumann and Nelson Liu.
This code is released under the terms of the Apache 2 license.