You are given the following task:
Analyzing a large dataset (2 TB) of contractual JSON documents, keyed by doc_id
Tagging entities (companies) in the documents using entity_id from a collection of company profiles
Building a search system that, in response to a query, retrieves (a) related documents ranked by relevance and (b) related companies and their products
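To make the requirements above concrete, here is a minimal in-memory sketch of the data shapes and the expected query result. It is only an illustration under stated assumptions: the class names (`ContractDoc`, `CompanyProfile`), the `name`, `products`, and `text` fields, the substring-based tagging, and the term-count relevance score are all hypothetical stand-ins. Only `doc_id` and `entity_id` come from the task description, and nothing here would scale to 2 TB.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CompanyProfile:
    entity_id: str                      # from the task description
    name: str                           # assumed field
    products: list[str] = field(default_factory=list)  # assumed field


@dataclass
class ContractDoc:
    doc_id: str                         # from the task description
    text: str                           # assumed: flattened text of the JSON doc
    entity_ids: set[str] = field(default_factory=set)  # filled in by tagging


def tag_entities(doc: ContractDoc, profiles: list[CompanyProfile]) -> None:
    """Naive tagging: case-insensitive substring match on the company name."""
    lowered = doc.text.lower()
    doc.entity_ids = {p.entity_id for p in profiles if p.name.lower() in lowered}


def search(query: str, docs: list[ContractDoc], profiles: list[CompanyProfile]):
    """Toy relevance: count of query-term occurrences in the document text."""
    terms = query.lower().split()
    scored = []
    for d in docs:
        words = d.text.lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            scored.append((score, d))
    # Highest score first; sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    ranked_doc_ids = [d.doc_id for _, d in scored]
    # Companies tagged in any matching document, with their products.
    hit_ids = set().union(*(d.entity_ids for _, d in scored))
    by_id = {p.entity_id: p for p in profiles}
    companies = {by_id[e].name: by_id[e].products for e in sorted(hit_ids)}
    return ranked_doc_ids, companies


# Hypothetical sample data, invented for illustration only.
profiles = [
    CompanyProfile("c1", "Acme", ["Anvil"]),
    CompanyProfile("c2", "Globex", ["Reactor"]),
]
docs = [
    ContractDoc("d1", "Acme supply agreement for supply of anvils"),
    ContractDoc("d2", "Globex lease agreement"),
    ContractDoc("d3", "Acme and Globex supply terms"),
]
for doc in docs:
    tag_entities(doc, profiles)

ranked_doc_ids, companies = search("supply", docs, profiles)
```

The sketch separates three concerns that the task implies: document storage keyed by `doc_id`, company profiles keyed by `entity_id`, and a link between the two that a query has to traverse after ranking.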
Question: What kinds of databases would you use for this task, and why?