Software Engineer Intern (Summer 2025)
About Weaviate
Weaviate is a leading vector database company that powers AI-native search and RAG applications for some of the world’s largest companies. Weaviate focuses on enabling the developer to get to value quickly while keeping costs under control – even at a massive scale.
Our team members work remotely across the globe with the flexibility to work from anywhere and at any time. Our people experience this as a massive benefit! Operating with a strong sense of ownership and collaboration, our teams prioritize results while empowering each individual to do their best work.
Join us for the summer and work with some of the world’s brightest and kindest (!) database engineers to push the boundaries of vector and hybrid search, RAG, and recommendations even further.
To get an impression of Weaviate under the hood, check out this Architectural Deep Dive from Co-Founder and CTO Etienne Dilocker at the CMU Database Group - ML⇄DB Seminar Series (2023).
What to expect
- Paid, full-time internship at one of the most exciting companies in the database and AI space.
- Post-internship career opportunities at Weaviate.
- Make meaningful contributions that change how our users use Weaviate (see example project list below).
- Work directly with experts in Databases, AI, IR, etc.
- Low-level work possible: Weaviate’s core components (LSM stores, vector indexes, filtering and inverted indexes, networking, replication, etc.) are all written in-house.
- Work entirely remotely from anywhere in the world.
- Participate in company off-site events during your internship (Note: There is no guarantee that an off-site event will happen during the course of the internship. If it does, you’re part of it).
Qualifications
- Programming Languages:
- For Core Database Development: Experience with statically typed languages such as C/C++, Go, Rust, or similar. (Note that Weaviate is written in Go; prior Go experience is preferred but not strictly required)
- For ML Research and Fine-tuning: Proficiency in Python, experience with PyTorch, and knowledge of CUDA
- Preferred Coursework: Database-related courses, Information-Retrieval-related courses, System Design, Advanced Algorithms and Data Structures.
- A strong sense of ownership of your work, ability to work independently
- Kindness and excellent communication skills
Eligibility for an internship:
- For U.S. Citizens:
- You must have all necessary documentation to be legally employed in the United States (e.g. Social Security Number) before commencing the internship.
- For Foreign Students:
- You must have a valid student visa (e.g. F-1) in place at the time of application.
- You are responsible for securing the appropriate work authorization (e.g. CPT, OPT) prior to the start of your internship. This includes coordinating with your educational institution to obtain the necessary approvals and documentation.
- Please note that employment is contingent upon receiving proper authorization, and you will be required to provide proof of work eligibility before commencing the internship.
Example projects
Below is a list of example projects by their respective area. If you have your own project ideas that fit with the general theme of projects, please let us know. When applying, please specify your top 3 project preferences.
Storage & Networking
- Improve filter performance at scale (roaring bitmaps, inverted indexes, etc.)
- Design and implement new index types that can be used with various storage backends (NVMe, network disk, cloud storage)
- Optimize hot-path code in the storage engine (Memory allocations and usage, Disk I/O, SIMD, etc.)
- Use state-of-the-art data structures to improve the performance and storage footprint of existing index types
- Improve inter-node network communication by using lightweight and optimized network protocols.
Vector Indexes / Graph Algorithms
- Improve query performance (approximating distances via FINGER, early exit conditions, etc.)
- Improve range filter performance for vector search (time-series vector search, SeRF, etc.)
- Improve general filtered vector-search performance (ACORN, etc.)
- Design and implement a new cloud-storage-based vector index for specific scenarios
Information Retrieval, RAG, End-to-end applications
- Fine-tune embedding models and rerankers for specific domains (adaptive fine-tuning, synthetic data generation, ColBERT)
- Video embedding models and natural language search over video (CLIP, contrastive learning)
- Develop new hybrid recommendation systems, combining collaborative filtering with content-based approaches (autoencoders, GNN, matrix completion)
- Optimize Model Inference (vLLM, TensorRT)
- Weaviate as a library: Investigate and develop library usage of Weaviate, focusing on embedding Weaviate into the popular Golang-based Ollama inference service
Are you interested?
Have a look at this page to learn what you can expect from our interview process. Be aware that conducting a background check is part of our onboarding.
If you are interested in Weaviate and this role, you can apply via the ‘apply now!’ button below. All of our communication will be done in response to your application. If you have any questions, feel free to reach out to our recruiter via the application. In this way, we ensure that our people can focus on doing their best work.
- Department: Database
- Remote status: Fully Remote
- Employment type: Internship
About Weaviate
Weaviate is an AI startup with open source at its core. Our AI-native vector database uses machine learning to create meaningful insights from unstructured data in a completely new way. Named one of Forbes’ Top 50 AI startups, and with over a million monthly downloads, Weaviate is quickly growing in popularity with developers and enterprises alike.