Available for contract

Felipe
Carlos

Full-stack engineer working across the full Earth Observation lifecycle: processing data, preserving knowledge, and building AI tools that help people explore the EO landscape.

About

A note in plain language.

I'm a Full-Stack engineer with a master's in Applied Computing for Reproducible EO Applications from the Brazilian National Institute for Space Research (INPE). I'm a natural problem solver and a software craftsman, shipping high-quality software to solve real-world problems. I mostly write solutions in Python, R, and TypeScript, and sometimes play with C++ and Common Lisp.

My work spans the full Earth Observation lifecycle, from processing raw data, to preserving data and knowledge, to building AI systems that help people explore what has been preserved. I currently work on the sits project, designing highly scalable solutions to process big EO data, and consult across projects and organizations to strengthen their digital preservation and search capabilities. A common thread runs through this work: agent-based architectures built with LangChain, LangGraph, and MCP, paired with RAG systems, to change how people search and interact with EO data and knowledge.

Previously, I worked closely with the GEO community, especially the GEO Knowledge Hub, where I supported the development of core modules such as the Knowledge Package management system, the file system for Knowledge Packages, and REST APIs for spatial search.

São Paulo, Brazil · pt-BR · en

Stack

Every layer earns its place.

Every tool here has replaced something that didn't work. The stack reflects decisions made under real constraints. Not defaults, not trends.

  • Languages: Python, TypeScript, R, C++.
  • Geospatial: sits, GDAL, GEOS, terra, rasterio, leaflet.
  • Data: pandas, polars, numpy, xarray, pydash.
  • Web: Flask, FastAPI, React, Tailwind.
  • Infra: Docker, Terraform, AWS, pgVector, Pinecone, Cohere, OpenSearch.
  • Observability: Langfuse, structlog, Grafana.
  • Providers: Anthropic, OpenAI, Ollama, vLLM, llama.cpp.
  • Orchestration: LangGraph, LangChain, MCP, Celery.
[Figure: the stack as an isometric chip, with layered categories of tools rendered as raised blocks on a chip substrate and PCB-style traces between related layers.]

Experience

Selected roles, recent first.

Various organizations · Feb 2024 → Present
Consultant Full-Stack Engineer · Remote

Consult across organizations to strengthen their digital preservation and search capabilities. Build platforms, design metadata standards, and develop agent-based architectures paired with RAG systems (semantic search, rerankers) and modern search backends such as Typesense and Neo4j, to improve how people interact with preserved data and knowledge.

Python · Flask · InvenioRDM · LangChain · LangGraph · MCP · RAG · llama.cpp / vLLM · OpenSearch · pgVector · React · Tailwind / Shadcn UI

e-sensing · Feb 2024 → Present
Open-Source Software Developer · Remote

Maintainer of the sits R package, contributing across the full codebase. My work includes implementing algorithms for classification, clustering, and aggregation, among other features. We also manage the cloud infrastructure used to process big EO data. I maintain the package's Python wrapper and help build the project's AI-powered documentation assistant, making it easier for users to explore sits' capabilities.

R · Python · C++ · Time-series · EO Datacubes · STAC · ML/DL · GPU Computing · AWS · Azure · RAG

Group on Earth Observations (GEO) · Mar 2022 → Apr 2024
Consultant Full-Stack Developer · Remote

Core developer of the GEO Knowledge Hub (GKH), an InvenioRDM-based digital library serving the GEO community. Designed the platform from the ground up and implemented its core modules, including the Knowledge Package management system, comments and feedback, spatial search, and React-based interfaces. Beyond the main GKH, I supported the conceptualization and development of the National GKH, a dedicated tool that helps countries build their own GKH instances.

Python · Flask · InvenioRDM · Semantic UI · React · PostgreSQL · OpenSearch · Preservation · Geospatial · Next.js · Tailwind / Shadcn UI

Brazil Data Cube · INPE · Jan 2020 → Feb 2022
Associate Collaborator · São Paulo, Brazil

Reproducibility maintainer for the project's research outputs. Maintained shared JupyterHub and RStudio environments and authored technical documentation across the Brazil Data Cube stack.

JupyterHub · RStudio · Reproducibility

Education

Formal training.

Technology College of São Paulo (FATEC) 2016 – 2019
Tech., Systems Analysis & Development

Developed ICan.js, a JavaScript library for web accessibility powered by deep learning. I was awarded the Academic Merit Award for best student performance.

National Institute for Space Research (INPE) 2020 – 2022
M.Sc., Applied Computing

Proposed a platform to support the management of reproducible geospatial applications, implemented using Python, Flask, and InvenioRDM.

Selected Projects

Three systems, one domain.

Selected projects I've contributed to across the EO space, including a digital library for preserving EO knowledge, an end-to-end toolkit for LULC classification, and an agent that enhances search experiences in technical documentation.

GEO Knowledge Hub - Digital library for the GEO Community

2022–2024 Production
Project

Open digital library serving the Group on Earth Observations (GEO). It preserves Knowledge Packages (reproducible bundles of data, papers, and code) and makes them discoverable, citable, and reviewable.

Role

Core developer. Designed the platform from the ground up on top of InvenioRDM and implemented the custom modules that distinguish it from a generic repository.

Custom modules
  • Knowledge Package management system
  • Comments and feedback
  • React pages and spatial search
  • National GKH module
Stack
Python Flask InvenioRDM React PostgreSQL OpenSearch Next.js TypeScript
Status

In production since 2022 and still running; the source is open under the MIT license.

GEO Knowledge Hub architecture: two layers. On top, React surfaces (landing, deposit, authoring) call directly into InvenioRDM. Underneath, the InvenioRDM platform (records, communities, storage) is augmented from within by the custom modules we built: Knowledge Packages (REST API, schema), Comments & Feedback (curation API), and Spatial Search (OpenSearch, bounding box). The modules are plugins that extend the platform, not middleware between the UI and InvenioRDM.

Platform architecture — custom modules atop the InvenioRDM foundation.

sits - Satellite Image Time Series Analysis for Earth Observation Data Cubes

2022–Present Open source
Project

An end-to-end toolkit for land use and land cover classification using big Earth observation data. Used in production by national space agencies and research institutions.

Role

Contributor across the entire codebase, which includes the R API, the C++ internals, and the Python wrapper (pysits).

Activities
  • Implementation and enhancement of algorithms available in the package
  • Performance improvements using C++
  • pysits Python wrapper via rpy2 and arrow
Stack
R Python C++ terra arrow ggplot GDAL
sits dependency layers: three layers, top to bottom. Python (pysits) sits at the top and calls into the R API via rpy2 and arrow. The R API (sits) in the middle is the public interface that users call, and it delegates to C++ kernels. The C++ kernels (Armadillo) handle performance-critical work such as Bayesian smoothing and classification, and form the performance foundation under both the R and Python entry points.

Layered dependencies — Python wraps R, R delegates to C++.

sitsrag - Agentic RAG over the sits R package

2025–Present Research
Pattern

Retrieval-augmented agent over the documentation and source of the sits R package, built with LangChain, FastAPI + Agent Protocol, Shadcn, and Next.js.

State

User question → retrieved doc chunks (dense + lexical) → reranked context → grounded answer with citations.
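
As an illustration only, and not the sitsrag implementation, the flow above can be sketched in plain Python. The sketch assumes the dense and lexical results are merged with reciprocal rank fusion, which the description does not specify. The chunk ids, the `CHUNKS` store, and the `identity_rerank` stand-in are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def identity_rerank(chunk_ids):
    """Hypothetical stand-in for a real reranker; keeps the fused order."""
    return list(chunk_ids)


def build_context(dense_hits, lexical_hits, chunks, rerank=identity_rerank, top_n=3):
    """Return numbered context lines that an answer can cite as [1], [2], ..."""
    fused = reciprocal_rank_fusion([dense_hits, lexical_hits])
    selected = rerank(fused)[:top_n]
    return [
        f"[{i}] {chunks[cid]['source']}: {chunks[cid]['text']}"
        for i, cid in enumerate(selected, start=1)
    ]


# Hypothetical chunk store and retriever outputs, for illustration only.
CHUNKS = {
    "a": {"source": "doc-a", "text": "placeholder text A"},
    "b": {"source": "doc-b", "text": "placeholder text B"},
    "c": {"source": "doc-c", "text": "placeholder text C"},
    "d": {"source": "doc-d", "text": "placeholder text D"},
}
DENSE_HITS = ["a", "b", "c"]    # ids ranked by a dense (embedding) retriever
LEXICAL_HITS = ["b", "d", "a"]  # ids ranked by a lexical (keyword) retriever

if __name__ == "__main__":
    for line in build_context(DENSE_HITS, LEXICAL_HITS, CHUNKS):
        print(line)
```

Chunks that both retrievers rank highly rise to the top of the fused list, and the numbered lines give the answer step stable markers to cite.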

Why a graph

Routes between retrieval, clarifying questions, and tool-mediated lookups against package internals. Answers stay grounded in the actual sits codebase.
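
The routing idea can be sketched without any graph library. This is a hedged illustration, not the project's actual graph: the state fields, the node names, and the word-count heuristic for asking a clarifying question are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    question: str
    context: list = field(default_factory=list)
    needs_internals: bool = False  # hypothetical flag: question is about package internals


def route(state):
    """Pick the next node name from the current state."""
    if len(state.question.split()) < 3:  # assumed heuristic: too short to answer
        return "clarify"
    if not state.context:
        return "retrieve"
    if state.needs_internals:
        return "tool_lookup"
    return "answer"


def retrieve(state):
    # Placeholder for a documentation retrieval step.
    state.context.append("retrieved chunk (placeholder)")


def tool_lookup(state):
    # Placeholder for a tool call that inspects package internals.
    state.context.append("package internals lookup (placeholder)")
    state.needs_internals = False


NODES = {"retrieve": retrieve, "tool_lookup": tool_lookup}


def run(state, max_steps=5):
    """Follow the router until it reaches a terminal node; return the visited path."""
    path = []
    for _ in range(max_steps):
        step = route(state)
        path.append(step)
        if step in ("clarify", "answer"):
            break
        NODES[step](state)
    return path
```

For example, `run(AgentState(question="How does smoothing work internally?", needs_internals=True))` visits `retrieve`, then `tool_lookup`, then `answer`, while a one-word question stops at `clarify`.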

Providers

Anthropic · OpenAI · Cohere

Stack
Python · FastAPI · pgVector · Langfuse · OpenAI embeddings · FlashRank / Cohere rerankers · Source-aware citation extractor

Writing & Talks

Words and stages.

Selected Publications
2026
On Benchmarks For Satellite Image Machine Learning (Accepted)
International Geoscience and Remote Sensing Symposium (IGARSS) 2026
2024
Bayesian inference for post-processing of remote-sensing image classification
Remote Sensing
2022
Sharing and preserving GEO community applications through the GEO Knowledge Hub
AGU Fall Meeting 2022
Webinars
2026
Governance, Licensing & Ethics for Earth Intelligence
GEO Data and Knowledge Webinar series
2026
Youth Webinar: What is the GEO Knowledge Hub?
GEO Data and Knowledge Webinar series
2025
GEO Knowledge Hub webinars (6 episodes)
GEO Knowledge Hub Webinar Series
Events

In-person, on the ground.

  1. Open Data Open Knowledge Workshop 2025 · 2025
    Organizer · Rome, Italy

    Organized and presented at ODOK 2025

  2. Open Data Open Knowledge Workshop 2024 · 2024
    Organizer · Hangzhou, China

    Organized and presented at ODOK 2024

  3. GEO Week 2023 · Nov 2023
    Speaker · Cape Town, South Africa

    Spoke at the "From Data to Open Knowledge" implementation session and organized several Showcase sessions during the week.

Closing note

Building the unglamorous infrastructure where AI and EO meet

Fastest by email. I read everything.