API Gateway for MongoDB for CRUD and vector search

June 7, 2024

Product: API Gateway for MongoDB for CRUD and vector search

Industry: Dev Tech, AI

Market: Software engineers and architects, B2B for IT departments

Start Date: January 2024  

Status: In development (currently available in the US)

Technologies: TypeScript, MongoDB Atlas, Bun, vector search, LLM, Docker

About the project

Artificial intelligence and machine learning models are transforming the way we interact with information. These models can ingest and effectively use vast volumes of data, creating new opportunities across industries.

Armed with first-hand insight into just how many doors AI and ML capabilities have opened, the Nearup development team decided to contribute our own solution. While our idea is still in the works, we wanted to give you a quick sneak peek into Nodb: an API gateway that leverages modern AI techniques to extract information using natural language and removes the complexity of CRUD operations, pagination, and sorting.

The idea for Nodb was born from our desire to provide our peers with a vector search solution that would cut the time they spend searching documents to understand how a particular function works. From an end user's point of view, Nodb vectorizes their questions and matches them to the nearest neighbors to provide immediate, detailed, and relevant feedback.

Project duration

The Nearup team began development in January 2024. Our next steps will involve enabling integration with some of the biggest messaging and collaboration tools, including WhatsApp, Telegram, and Slack. Additionally, we plan to enhance document querying capabilities through native AI parsing and enrichment prompts.

Our process and goals

We are developing unified access to multiple databases, like MongoDB, Postgres, Redis, MySQL, and more. Nodb generates embeddings for everything it saves in its database, so the information can later be retrieved through natural language queries.

Our API simplifies database access and installation. With a unified API gateway, there is no need to write separate code for each database or install databases locally in the project. Instead, they can all be accessed seamlessly over HTTP.
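To illustrate the point, accessing an entity could look like a plain HTTP call rather than driver-specific code. A minimal sketch follows; the base URL, route shape, and payload here are illustrative assumptions, not Nodb's published API:

```typescript
// Hypothetical Nodb-style gateway call: CRUD over HTTP instead of a DB driver.
// The base URL and route shape below are assumptions for illustration only.
const BASE = "http://localhost:3000/api/my-app/dev";

// Build the HTTP request for creating an entity; sending it is just fetch(req).
function buildCreateRequest(entity: string, body: unknown): Request {
  return new Request(`${BASE}/${entity}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
}

const req = buildCreateRequest("employees", { name: "Anna", skills: ["React"] });
console.log(req.method, new URL(req.url).pathname);
```

The same pattern would apply to reads, updates, and deletes: the client only ever speaks HTTP and JSON, so no database needs to be installed alongside the application.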

Note that the API we are creating deals with JSON data only, which means we can leverage OCR techniques to process documents like PDFs into JSON and store them in our database, just like any other database record.
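Because the gateway stores JSON only, an OCR-processed PDF can sit alongside ordinary records. The record shape below is an illustrative assumption, not Nodb's actual schema:

```typescript
// Assumed (illustrative) shape for an OCR-processed PDF stored as a JSON record.
interface OcrDocument {
  source: string;   // original file name
  mimeType: string; // e.g. "application/pdf"
  text: string;     // text extracted by OCR
  pages: number;    // page count of the original document
}

const cvRecord: OcrDocument = {
  source: "anna_cv.pdf",
  mimeType: "application/pdf",
  text: "Anna. Skills: TypeScript, Node.js, MongoDB.",
  pages: 2,
};

// Serializes like any other JSON record, so it can be queried or vectorized later.
console.log(JSON.stringify(cvRecord));
```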

What can Nodb be used for?

So, what is our end goal? In simplest terms, it is to enable users to “Talk to the database.” Using vector search, we can conduct semantic searches and utilize Retrieval Augmented Generation (RAG), incorporating chat history as contextual information to engage in meaningful interactions with our data.

For instance, imagine a project manager looking to assign work based on their team’s capabilities. The PM could ask, “Does Anna know how to work in React?” Relying on the information processed and kept in the database, they could receive feedback along the lines of, “No, she doesn’t have React skills in her CV.” 

But it’s a lot more complex than that. The Nearup team built the API to facilitate several AI development use cases, primarily:

  • Document intelligence

Nodb is built to handle different types of documents, including ID cards, receipts, PDF files, images, and more, which users submit for processing through optical character recognition (OCR) technology.

This will be particularly useful for HR professionals, who could rely on Nodb to request the “first and last name, skills, and last two employers” for a job applicant. Ultimately, we see our product reducing the time it takes to review applicant profiles and browse large volumes of information before uncovering the candidates that best match the criteria.

We are also confident in the successful application of Nodb in eCommerce, where traditional search methods struggle to capture the context and history that influence user preferences. Our solution's vector search capabilities will match user queries to user-specific information, like preferences, interests, hobbies, and anything else that can be built into a detailed data representation. This will allow for personalized recommendations based on similarity to the user's profile.

  • Web page scraping and content discovery

Our vector search implementation will help transform the content discovery experience. Unlike traditional search methods constrained by keywords, vector search represents product features as vectors in a multi-dimensional space. When users search for specific attributes, their queries are compared against the vectors representing all products in the dataset. Based on similarity, this comparison swiftly identifies the products that most closely align with the user's preferences.

As such, Nodb will allow you to offer a more intuitive and personalized user experience, enhancing efficiency and satisfaction in finding relevant options. For example, users will be able to send a URL of a page they want to learn more about, ask any question about that page, and retrieve information such as Title Tag, meta description, or a complete list of products featured on the page (along with essential info like product name, price, and specifications). 

  • Natural language processing (NLP)

When building Nodb, we also thought about the fact that vector search can aid in deepening the interaction with technologies like chatbots, virtual assistants, and language translators, making them more conversational and intuitive. 

So, in a situation where a user needs a template for a memo, Nodb (empowered by vector search) will generate the memo itself, guided by the user's constraints. How? By retrieving the nearest-neighbor vectors from similar documents and feeding them to an LLM as context, vector search helps the AI understand and respond to queries with relevant information. This approach enhances the accuracy and efficiency of NLP applications and allows for a deeper understanding of user intent and context.

Approach & technology stack

We chose our technologies based on our ultimate goal: to offer faster, easier, and more accurate and relevant information search. With that in mind, we settled on:

  • Languages and Frameworks: TypeScript, Bun runtime, Hono
  • Platform: Docker
  • Database: MongoDB Atlas
  • LLM and Embedding Models: OpenAI, Anthropic Claude, Voyage AI

Some of you will take one look at this list and instantly understand our selection process. But for those of you who are not as tech-savvy, let us tell you a bit more about vector search since it’s fueling the entire operation. 

Vector search

Vector search (also known as nearest neighbor search) is an AI and data retrieval method that relies on mathematical vectors to perform searches within complex, unstructured datasets. It allows you to find the pieces of information most closely related to the search query.

While traditional models look for exact matches, vector search considers many dimensions of a query at once. It compares the query's vector against the stored vectors across all of those dimensions and returns the items with the most similar features or attributes.

It is a more nuanced, sophisticated, and, most importantly, more accurate way to search through complex datasets. It extracts contextual relevance that can be applied across different applications (which we hope to see Nodb utilized for as well!):

  • Retrieve similar information – Vector search creates a thesaurus for the application, not just for words but for entire datasets. This way, it adapts to the contextual output more directly, enabling users to find variations relevant to their query faster.
  • Filter content and make valuable recommendations – It can identify content with similar features as it moves beyond keyword association and considers thousands of different contextual data points. 
  • Retrieval Augmented Generation (RAG) – Vector search helps extract meaning from the available data. It works by organizing data in a way that captures its semantic value. Users can continually add more context to their datasets, allowing the AI to generate more relevant responses based on the context it learns from the data.
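To make the nearest-neighbor idea concrete, here is a minimal sketch in plain TypeScript (not Nodb code) of scoring stored vectors against a query vector by cosine similarity and picking the closest match:

```typescript
// Minimal nearest-neighbor search by cosine similarity. Illustrative only:
// real systems like MongoDB Atlas use approximate indexes, not a full scan.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the document whose stored vector is most similar to the query vector.
function nearest<T>(query: number[], items: { vec: number[]; doc: T }[]): T {
  let best = items[0];
  for (const item of items) {
    if (cosine(query, item.vec) > cosine(query, best.vec)) best = item;
  }
  return best.doc;
}

// Toy 2-D "embeddings": the query points in nearly the same direction as doc B.
const docs = [
  { vec: [1, 0], doc: "A" },
  { vec: [0.6, 0.8], doc: "B" },
  { vec: [0, 1], doc: "C" },
];
console.log(nearest([0.7, 0.7], docs)); // "B"
```

Real embeddings have hundreds or thousands of dimensions rather than two, but the principle is the same: similarity of direction in vector space stands in for similarity of meaning.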

Let's take a deeper dive into RAG's potential.

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) tackles knowledge-intensive tasks by combining an information retrieval component with a text generation model. This hybrid approach allows RAG to access external knowledge sources efficiently and adaptively without retraining the entire model.

Here’s how it works: RAG takes input and retrieves a set of relevant documents from a knowledge source. These documents are then combined with the original input and provided as context to the text generation model that uses this context to produce the final output. This approach allows RAG to adapt to situations where facts may change over time, unlike traditional language models characterized by static knowledge.
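The retrieve-then-generate flow can be sketched as follows. The retriever and generator below are stand-in functions we made up for illustration; in practice the retriever would be a vector search and the generator an LLM call:

```typescript
// Sketch of the RAG flow: retrieve context, combine it with the input,
// then generate. Stand-in functions replace vector search and the LLM.
type Retriever = (query: string) => string[];
type Generator = (prompt: string) => string;

function answerWithRag(query: string, retrieve: Retriever, generate: Generator): string {
  const docs = retrieve(query);                                // 1. retrieve relevant documents
  const context = docs.map((d, i) => `[${i + 1}] ${d}`).join("\n");
  const prompt = `Context:\n${context}\n\nQuestion: ${query}`; // 2. combine with the input
  return generate(prompt);                                     // 3. generate the final output
}

// Toy stand-ins just to show the data flow end to end.
const toyRetrieve: Retriever = () => ["Anna's CV lists TypeScript and Node.js."];
const toyGenerate: Generator = (prompt) => `(model sees) ${prompt}`;
console.log(answerWithRag("Does Anna know React?", toyRetrieve, toyGenerate));
```

Because the knowledge lives in the retrieval step rather than the model weights, updating what the system "knows" means updating the document store, not retraining the model.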

RAG is highly beneficial for domain-specific and knowledge-intensive scenarios that require continual knowledge updates. Because it enables language models to access the latest information without retraining, it is useful for generating reliable outputs that reflect the most up-to-date information. 

The bottom line is that RAG enhances the accuracy, relevancy, and controllability of the LLM’s response. 

Key Nodb features and solutions

At the time of writing, Nodb was still in the early stages of development. What we can share from our Project Specifications is the list of features we intend to roll out in phases:

  • Multiple environments within each application 
  • Resource (entities) split between application collections
  • CRUD operations on single or multiple entities (GET, POST, PUT, PATCH, DELETE)
  • Vector search option using various embedding models (OpenAI, Voyage AI)
  • RAG feature using various LLMs (OpenAI, Claude)
  • Document processing and storage (.pdf, .csv, .docx, etc.)
  • User-defined API options via environment variables
  • One-click deployment to Render 
  • More to come!
© 2024, NearUp. All Rights Reserved