How we transformed 10,000+ research articles into an intelligent knowledge base for you to use
We started with a mission: build the most comprehensive database of articles from behavior analytic journals. We scoured publicly available journals for freely accessible articles, curated them, and added them to a database.
Over 24 months, we collected more than 10,000 peer-reviewed articles, spanning decades of research. Each article was carefully cataloged by publication date, authors, and journal.
Raw PDFs and other documents aren't readily machine-readable. We needed a way to extract their content while preserving the structure of each document.
Our parser handled the complex elements in the PDF documents like a champ. It extracted the text, tables, and citations, ensuring we preserved the full context of each research finding.
Here's where the magic happens. We transformed each document into numerical vectors using advanced vector embedding models. Think of it as converting human knowledge into a format computers can understand.
This allows us to find connections between research that might not be obvious to human readers. Using the vector embeddings, we can find the most relevant research for a user's question.
Simplified 2D visualization of our high-dimensional vector space
Each document became a point in a high-dimensional space, where similar concepts group together. This allows us to calculate distance metrics between documents and find the most similar research papers.
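To make the idea of distance between documents concrete, here is a minimal sketch of two common metrics, cosine similarity and Euclidean distance. The three-dimensional vectors and the names paper_a, paper_b, and paper_c are made up for illustration; real embeddings have hundreds of dimensions, and this is not necessarily the metric our system uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 = identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy embeddings, invented for this example only.
doc_vectors = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.8, 0.2, 0.1],
    "paper_c": [0.0, 0.1, 0.9],
}

def most_similar(name, vectors):
    """Return (other_name, similarity) for the document closest to `name`."""
    target = vectors[name]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return max(scored, key=lambda pair: pair[1])

closest, score = most_similar("paper_a", doc_vectors)
print(closest)  # paper_b points in nearly the same direction as paper_a
```

Cosine similarity compares direction rather than length, which is why it is a popular choice for text embeddings; Euclidean distance is shown alongside it for comparison.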
The final step was building an interface that makes this knowledge accessible. Our vector store allows for searching information based on meaning, not just keywords.
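As a rough illustration of searching by meaning, the sketch below implements a toy in-memory vector store with a top-k search. The VectorStore class, the placeholder labels such as article-1, and the hand-written vectors are all assumptions for this example; a production store would get its vectors from an embedding model and would typically use an approximate nearest-neighbor index rather than a full scan.

```python
import heapq
import math
from dataclasses import dataclass, field

def _normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

@dataclass
class VectorStore:
    """Hypothetical toy store: keeps unit-length vectors alongside their labels."""
    entries: list = field(default_factory=list)

    def add(self, label, vector):
        self.entries.append((label, _normalize(vector)))

    def search(self, query_vector, k=3):
        """Return the labels of the k entries most similar to the query vector."""
        q = _normalize(query_vector)
        scored = (
            (sum(a * b for a, b in zip(q, vec)), label)
            for label, vec in self.entries
        )
        return [label for _, label in heapq.nlargest(k, scored)]

store = VectorStore()
store.add("article-1", [1.0, 0.0, 0.0])  # made-up vectors, not real embeddings
store.add("article-2", [0.0, 1.0, 0.0])
store.add("article-3", [0.7, 0.7, 0.0])

# In a real system the query text would first be embedded by the same model.
print(store.search([0.9, 0.2, 0.0], k=2))
```

Because the query and the documents live in the same vector space, a question can match an article that shares its meaning even when the two share few exact keywords.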
Example Query:
Results:
Functional analysis of separate topographies of aberrant behavior.
Derby & Wacker, 1994
Experimental analysis and treatment of multiply controlled problem behavior: a systematic replication and extension
Borrero & Vollmer, 2006
Functional assessment of problem behavior: dispelling myths, overcoming implementation obstacles, and developing new lore.
Hanley, 2012
Users can ask questions in natural language and receive relevant research findings, complete with citations and the context that supports the answer. The system continuously improves as more research is added.
Our vector store technology powers the Research Buddy tool, giving you instant access to insights from thousands of research papers. Ask questions in plain English and get answers along with the context that supports them.
Try Research Buddy