Mid-Senior Data Scientist (GenAI + Machine Learning)
Company not specified • Lisbon
Full\-time
Other
Job Description
**About the Role**
We are looking for a versatile Data Scientist to lead our AI initiatives, bridging the gap between classical Machine Learning and cutting\-edge Generative AI. You will contribute to the development of StarkVision \- our autonomous AI agent platform \- and maintain our core predictive models for Churn, Segmentation, Forecasting and others. This is a high\-impact role where you will not only build models but also architect complex multi\-agent systems that interact directly with our users and databases.
**Key Responsibilities**
1\. Generative AI \& Autonomous Orchestration
* Agent Orchestration: Design and optimize multi\-agent workflows. You will manage specialized agents (Database Admin, Report Agent, Cloud Specialist) to handle complex user requests.
* Text to SQL: Enhance our Text\-to\-SQL capabilities, ensuring accurate translation of natural language into complex SQL queries using LLMs (GPT\-4, Gemini, Grok).
* RAG \& Memory: Maintain and improve our long\-term memory systems using Vector Databases (ChromaDB) and RAG pipelines to provide context\-aware responses.
* Tool Integration: Develop and maintain MCP (Model Context Protocol) tools that allow agents to interact with external services (OneDrive, SQL Databases, PDF parsers).
2\. Machine Learning \& Predictive Modeling
* Churn Prediction: Maintain and improve our deep learning Churn Model built with TensorFlow/Keras (LSTMs). You will handle 3D time\-series data construction and model optimization.
* Customer Segmentation: Refine our clustering pipelines using Scikit\-Learn and Faiss (for GPU\-accelerated clustering). Implement advanced feature selection techniques and hybrid clustering approaches (K\-Means \+ Hierarchical).
* Forecasting: Manage time\-series forecasting modules for sales and demand prediction.
3\. Backend \& Deployment
* Production Engineering: Deploy models and agents within our Flask application, utilizing Redis and RQ (Redis Queue) for asynchronous background processing.
* Data Engineering: Write efficient SQL queries and Python scripts (Pandas, SQLAlchemy) to preprocess large datasets for both ML training and agent consumption.
4\. Other
* Company Workflows: Assist in requirements definition and help keep all stakeholders aligned on scope and priorities.
**Requirements**
* Bachelor's degree in Computer Science, Software Engineering, Computer Engineering, or a related technical field.
* 3\+ years of experience in Data Science or Machine Learning Engineering.
* GenAI Expertise: Proven experience building LLM\-based applications. Familiarity with Agentic workflows and RAG architectures is a must.
* Deep Learning: Hands\-on experience with Neural Networks, specifically LSTMs/RNNs for time\-series data (TensorFlow or PyTorch).
* Strong Math/Stat Foundation: Deep understanding of clustering algorithms, dimensionality reduction (PCA), and statistical forecasting.
* Coding Skills: Expert\-level Python. Comfortable writing production\-ready code, unit tests, and working with APIs.
* Database Skills: Strong SQL proficiency. You understand database schemas and can optimize queries.
**Nice to Have**
* Experience with Kubernetes and Helm Charts.
* Knowledge of Cloud Services.
* Experience with Text\-to\-SQL fine\-tuning or prompt engineering.
* Familiarity with Faiss for large\-scale similarity search.
* Knowledge of Model Context Protocol (MCP).
* Knowledge of Agent2Agent Protocols (A2A).
* Ability to handle ambiguity in complex requirements scenarios.
* Experience with on\-prem solutions.
**What We Offer**
* A great environment with real\-world challenges.
* Opportunity to work on high\-tech products integrating Analytics, AI and BI for Banking, Retail and Health.
* A modern technical environment (Modern Python, automated workflows).
* Hybrid work format.
Job type: Full\-time
Benefits:
* Meal card/voucher
Work location: On\-site