Senior AWS Platform Engineer

Argentina / Perú

 

Santex is a global technology company and Certified B Corporation, and one of North America's fastest-growing companies according to the Inc. 5000. It is present in 18 countries and over 100 cities, and is headquartered in Argentina with offices in the United States, Mexico, and Peru. Its two service lines, AI Tech Consulting and AI-Optimized Engineering, help organizations comprehensively adopt technology, expand their capabilities, and improve business outcomes at scale.

The same approach defines its culture: Santex has 80.4% employee engagement (global average: 21%), 8% annual turnover (tech sector: 20–25%), 9,369 training hours invested in 2025 (more than double the global average), and a 26.5% year-over-year reduction in carbon footprint, with carbon neutrality achieved through offsetting.

Santex. A company with purpose and judgment. Yes, it's possible.

 

We are an Equal Opportunity Employer and are committed to fostering an inclusive and diverse workplace. We do not discriminate on the basis of race, color, religion, sex, gender identity or expression, sexual orientation, national origin, age, disability, veteran status, genetic information, or any other characteristic protected by applicable law. All qualified applicants will receive consideration for employment without regard to any of these factors. We strongly encourage candidates from all backgrounds to apply.

About the role
The initial project replaces a 15-year-old Solr search index, along with the surrounding integration layer, with an AI-native retrieval and generation platform on AWS Bedrock + OpenSearch. The new platform serves 9 AI use cases across a 23-site WordPress multisite running on Cloudways (GCP), with on-prem SQL Server data flowing in via Mule 4 ESB, attachments in S3, and a Salesforce-sourced membership of ~45,000 executives.
You own the AWS infrastructure that makes all of it work: OpenSearch cluster ops, Bedrock integration, Lambda + SQS event-driven ingestion, API Gateway for the unified publisher contract, IAM topology across accounts, cost guardrails that hold against a published $10–18k/yr ceiling at 10x current Solr volume, and the observability story that lets Product see prompt/response traffic in production.
You'll work alongside an external architecture partner (Santex Lab) during Phase 1 for knowledge transfer. The pattern is clear: Santex leads weeks 1–2, you pair weeks 3–8, you lead weeks 9–16, you own from week 17. By the time Foundation goes GA, the internal team owns the platform — not the consultancy.
 
What you'll own
Amazon OpenSearch Service (managed)
Cluster sizing, instance class selection, node count for the realistic + 10x growth scenarios.
Index design (content / profile / metadata indices), mapping per index, k-NN configuration.
Snapshot/restore strategy, upgrade cadence, capacity planning.
Hybrid query construction with the AI/RAG engineer (you own the cluster; they own the retrieval logic).
RBAC pre-filter performance — this is the integration point where your work meets theirs.

Bedrock integration
Model access provisioning, regional routing, quota management.
Bedrock Knowledge Bases for the Chair Assistant + member-facing use cases.
Bedrock Guardrails configuration framework (the AI/RAG engineer defines policies; you operationalize them).
Cost monitoring per model, per use case, per tenant where applicable.

Event-driven ingestion
API Gateway in front of a single SQS queue — the "unified queue" pattern that lets cloud sources (cloned PISO Solr plugin, S3 attachment cascade) and on-prem sources (BI signal publishers from SQL Server) publish to the same envelope schema.
Lambda consumers per content shape (content / attachments / BI signals). DLQ handling, partial failure semantics, replay tooling.
Concurrency tuning, cold-start mitigation, throttling at the right boundaries.
Signed envelope validation at the API Gateway edge.

IAM and multi-account topology
Account strategy for AI workloads — dedicated AI account, isolation from production WordPress, SSO/role mapping.
Cross-account access patterns where data crosses boundaries (S3, Aurora, Bedrock).
Least-privilege roles for every Lambda, every service-to-service call. Policy review cadence.
SOC2 readiness work alongside internal IT (account provisioning, audit logging, network controls).

Network
VPC design for AI workloads.
On-prem connectivity for the Phase 4 Meeting Prep CDC pipeline (first time the internal organization runs Python tooling against on-prem SQL Server in production). VPC peering, security review, hybrid networking.
Public/private surface separation — API Gateway authenticated boundary vs internal-only Lambda paths.

Observability and cost
CloudWatch dashboards spec — what to monitor, alert thresholds, escalation paths.
Per-prompt/per-response capture pipeline (a Phase 1 deliverable the Product team requires for visibility).
Cost monitoring — Bedrock per-model spend, OpenSearch RI optimization, Lambda invocation cost, S3 storage classes.
Budget alerts wired to throttle rules + model downgrade triggers so cost overruns are mechanical, not political.

Operational data services
Aurora Postgres or S3 Parquet for the Meeting Prep BI signal store (depending on the ETL-vs-realtime decision pending Q4 2026).
DynamoDB for the Meeting Prep cache layer (1hr TTL, single-table design).
S3 dual-bucket attachment pattern — private bucket via REST proxy with auth token, public bucket via direct Nginx URL.
 
Must-have qualifications
7+ years of cloud infrastructure / platform engineering, of which 4+ years are hands-on AWS at production scale.
Deep OpenSearch (or Elasticsearch) operational experience. Cluster sizing, index design, snapshot/restore, query optimization, k-NN configuration. You've sized a cluster that didn't fall over.
Production Lambda + SQS event-driven systems — DLQ patterns, idempotent consumers, partial-failure handling, concurrency tuning, cold-start mitigation. You know the operational gotchas, not just the docs.
API Gateway in production — auth, request validation, throttling, signed payloads. You can defend your auth choices.
IAM at scale — multi-account, cross-service, least-privilege design. You've designed an IAM topology that survived contact with a security audit.
CloudWatch + observability: dashboards, alarms, log aggregation, slow-query analysis. You know when to reach for X-Ray, when CloudWatch Logs Insights is enough, and when you need a third-party tool.
AWS cost engineering — Cost Explorer, budget alerts, RI/Savings Plans, per-service waste detection. You've cut someone's AWS bill in half without breaking anything.
Infrastructure-as-Code: CloudFormation, CDK, or Terraform. We'll commit to a stack with you on day one.
Strong written communication. Runbooks, ADRs, IAM policy rationale, incident reports. The team transitions to internal ownership — the docs are the deliverable as much as the infrastructure.

Strongly preferred
AWS Bedrock production experience — model invocation, Knowledge Bases, Agents, Guardrails. Quota management, region availability, model selection economics. If you've shipped on Bedrock, ramp time drops significantly.
Compliance work — SOC2, HIPAA, or comparable. You've been through an audit. You know what evidence looks like.
Hybrid cloud / on-prem connectivity — VPC peering, Direct Connect, Site-to-Site VPN, security review processes. The Meeting Prep phase needs this.
CDC / change data capture pipelines — Debezium, native SQL Server CDC, or comparable. Phase 4 design depends on it.
Multi-account AWS Organizations setup: SSO, Control Tower, account vending, cross-account roles. You've stood one up, not just used one.
Experience supporting a Solr-to-OpenSearch migration — parallel-run cutover, traffic shifting, deprecation triggers. (Phase 2 is exactly this.)

Nice to have
Aurora Postgres operational experience (schema migration, backups, query tuning).
DynamoDB single-table design and capacity-mode tuning.
Experience operating on top of Cloudways (or any managed-WordPress host) for the integration boundary.
Familiarity with Mule 4 ESB as an integration boundary (you won't operate it — the internal team does — but you'll integrate against it).
Background in regulated or enterprise SaaS (financial services, healthcare, legal, executive education).
Familiarity with Bedrock Agents and where they help vs where they force fallback to custom orchestration.
 
Advanced English:
Excellent verbal and written communication skills in English.
