Redmond, WA, US
Data Engineer III, Amazon Leo AI Foundations
Amazon Leo is Amazon’s low Earth orbit satellite network. Our mission is to deliver fast, reliable internet connectivity to customers beyond the reach of existing networks. From individual households to schools, hospitals, businesses, and government agencies, Amazon Leo will serve people and organizations operating in locations without reliable connectivity.

This role is for a Data Engineer who will design, implement, and operate globally distributed systems that let Leo answer queries in low single-digit seconds from a near-real-time analytics layer or lakehouse, and that support agentic AI capabilities built on top of it. You’ll build these systems using the latest AWS technologies and best-in-industry data engineering practices.

Export Control Requirement:
Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum.

Key job responsibilities
- Architect and implement a scalable, cost-optimized S3-based Data Lakehouse that unifies structured and unstructured data from disparate sources.
- Architect and implement a scalable, cost-performance-optimized OLAP-based analytics layer.
- Establish metadata management with automated data classification and lineage tracking.
- Design and enforce standardized data ingestion patterns with built-in quality controls and validation gates.
- Architect a centralized metrics repository that becomes the source of truth for all Leo metrics.
- Implement robust data quality frameworks with staging-first policies and automated validation pipelines.
- Design extensible metrics schemas that support complex analytical queries while optimizing for AI retrieval patterns.
- Develop intelligent orchestration for metrics generation workflows with comprehensive audit trails.
- Lead the design of semantic data models that balance analytical performance with AI retrieval requirements.
- Implement cross-domain federated query capabilities with sophisticated query optimization techniques.
- Architect a globally distributed vector database infrastructure capable of managing billions of embeddings with consistent sub-100ms retrieval times.
- Design and implement hybrid search strategies combining dense vectors with sparse representations for optimal semantic retrieval.
- Establish automated compliance validation frameworks ensuring data handling meets Amazon's security standards.

A day in the life
This role is for a Data Engineer who will build new cloud services and APIs that support and orchestrate Leo AI Foundations, enabling intelligent software operation across Leo devices such as satellites, ground gateways, and customer terminals. You will design and deliver low-latency, highly scalable architectures that are critical to providing high-quality internet service and AI capabilities to customers.

About the team
Leo AI Foundations builds the intelligent cloud backbone that powers AI-driven decision making across Amazon’s Leo constellation — from satellites and ground gateways to customer terminals. Our team designs and operates large-scale data and compute systems that enable training, inference, and agentic intelligence for optimizing network performance, routing, and user experience in real time. We combine expertise in distributed systems, data lakehouse architectures, and applied machine learning to deliver scalable, low-latency AI capabilities that integrate seamlessly with Leo's software-defined space and ground systems. We move fast, innovate boldly, and work across boundaries to make global connectivity smarter and more efficient.