Need to find answers to hard questions across multiple sources, including your private data? Use our Knowledge Agent, powered by AI search, to scan up to billions of rows of any data (images, PDFs, text, tables, and more) and deliver a well-researched answer.
What about my local files on my computer? Or is it cloud-first? Would I need to sync files to Dropbox or Google Cloud first?
Congrats on the launch, @david_buniatyan! I saw Deep Lake works with all kinds of data. How do you handle something like videos or images compared to text?
The accuracy is phenomenal. Kudos to Davit and the Activeloop team for turning fiction into reality!
This is exciting! Compelling demo!
I'm curious: how effective is Deep Lake's integrated knowledge retrieval approach at avoiding hallucinations and at finding relevant articles that other tools in the space miss?
The traditional stack to chat with your own multimodal data is indeed becoming too large to maintain.
Deep Lake is a game changer - it takes away all the complexity, while maintaining similar quality.
Congrats on the launch of Deep Lake AI Knowledge Agent! The ability to perform deep research across multiple data types and sources is impressive!
Behind every great product there is a great team. Congrats Davit and the team with the launch.
Congrats on the launch @david_buniatyan and team!
Really interested in testing this out.
Have you tried testing it with PDF data that includes architectural blueprints, technical drawings, and handwritten comments?
We have a specific use case involving construction projects—would love to see how it handles this!
Deep knowledge retrieval is the future. Kudos to the builders of this tool.
Wow, this is exciting! At Cloudchipr, we store a vast amount of data in object storage with diverse structures, from CSVs to time series and key-value data. This is a game-changer for us in generating various general statistics and empowering customers to "talk to their data" without being restricted by data type.
Looks like a very powerful tool! Does it support searching over Google Drive as well?
Do you guys provide an API for your tool's deep research features?
This is exciting! I faced similar knowledge base challenges during my time at an omnichannel e-commerce company. Would love to try it out :)
This looks like an incredibly useful tool for tackling complex, multi-source research questions. The ability to search across diverse data types and extract well-researched answers is definitely something the community will benefit from. Looking forward to seeing how people adopt it in different use cases.
Your data is your ultimate competitive advantage! Leveraging it effectively isn’t just an option anymore - it’s the key to staying ahead. Exciting to see solutions like Activeloop Agent unlocking its full potential, driving smarter decisions, and creating real impact!
How does it handle data quality and relevance when dealing with diverse sources?
Hi Product Hunt!
I'm Sasun, Activeloop's (YC S18) Director of Engineering. I've previously co-founded Pixomatic, one of the early successful photo-editing apps. Naturally, one of the things that excites me is how to visualize (and query) unstructured data, like images.
Except… back in the day, there was no SQL for images.
Then I met @david_buniatyan, who started Activeloop with that mission: store complex data (images, videos, text, etc.) in a more organized way, and make it easily connectable to AI (for training, and for asking questions!).
This comes with a number of exciting technical challenges.
1. Unstructured data is… well… unstructured. It's hard to search across such data (imagine asking for all images that contain a bicycle larger than 200x350 pixels and two people in them).
Retrieval systems before Deep Lake weren't built for that.
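To make the example above concrete, here's a minimal sketch (my own illustration, not the Deep Lake API) of the kind of structured predicate you'd want to run over image data, assuming each image already carries per-object detections with bounding boxes:

```python
# Hypothetical pre-extracted detections for two images.
images = [
    {"id": "img1", "objects": [
        {"label": "bicycle", "w": 320, "h": 400},
        {"label": "person", "w": 80, "h": 200},
        {"label": "person", "w": 90, "h": 210},
    ]},
    {"id": "img2", "objects": [
        {"label": "bicycle", "w": 150, "h": 180},
        {"label": "person", "w": 80, "h": 200},
    ]},
]

def matches(img):
    # "a bicycle larger than 200x350 pixels, and two people in the image"
    big_bike = any(o["label"] == "bicycle" and o["w"] > 200 and o["h"] > 350
                   for o in img["objects"])
    people = sum(o["label"] == "person" for o in img["objects"])
    return big_bike and people >= 2

hits = [img["id"] for img in images if matches(img)]
print(hits)  # ['img1']
```

The hard part, of course, isn't the filter itself — it's extracting and indexing those structured attributes from raw pixels at scale so queries like this become possible at all.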
2. Vector Search is inaccurate. Achieving accuracy in AI-generated insights is challenging, especially in sectors like legal and healthcare, where accuracy is paramount. The issue is magnified at scale — for instance, when searching through the world's entire scientific research corpus.
Meanwhile, most data lives in data lakes built on object storage (e.g., AWS S3, Google Cloud Storage).
3. Limited memory. Bolting a vector index onto traditional database architectures does not provide the scalability AI workloads require: as your dataset grows, memory and compute requirements scale linearly. For datasets past 100M rows, keeping the index in memory becomes prohibitively expensive.
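A quick back-of-envelope (my numbers, not Activeloop's) shows why: just holding 100M raw float32 embeddings in RAM, before any index overhead, already needs hundreds of gigabytes.

```python
# Rough RAM estimate for an in-memory vector index at the 100M-row scale.
n_vectors = 100_000_000   # 100M rows, the threshold mentioned above
dims = 1536               # assumed embedding size (common for text models)
bytes_per_float = 4       # float32
total_gib = n_vectors * dims * bytes_per_float / 2**30
print(f"{total_gib:.0f} GiB")  # ~572 GiB of raw vectors alone
```

Graph-based indexes like HNSW add further per-vector link overhead on top of this, which is why streaming the index from object storage instead of RAM changes the cost picture.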
My team and I focused on building this as Deep Lake's 'unfair advantage', since we're geared towards analytical cases where users need to ask questions across complex, large datasets. As a result, we're up to 10x more efficient than in-memory approaches.
4. AI Agents can fail… spectacularly
Not claiming we've totally solved this issue, but if there's even a 1% probability of failing or responding inaccurately at each step, a complex, multi-step system exhibits a 'butterfly' effect: with every additional step, the probability of an end-to-end failure compounds.
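That compounding effect is easy to put in numbers — with a 1% per-step failure rate, the chance of a fully correct run decays exponentially with pipeline depth:

```python
# End-to-end success probability for a multi-step agent,
# assuming independent failures at a 1% rate per step.
p_fail = 0.01
for steps in (1, 10, 50, 100):
    success = (1 - p_fail) ** steps
    print(f"{steps:3d} steps -> {success:.1%} chance of a fully correct run")
```

At 50 steps you're already down to roughly a 60% chance of a flawless run, and at 100 steps to about 37% — which is why per-step retrieval accuracy matters so much.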
So increasing retrieval accuracy is important: in critical verticals (autonomous driving, life sciences, healthcare, finance), it can be a matter of life and death, or of incalculable losses.