We are seeking a skilled and experienced Mid-level Datadog Dashboard Engineer. As a key member of our monitoring and observability team, you will be responsible for designing, developing, and maintaining advanced monitoring and visualization solutions using the Datadog platform. Your expertise will contribute to improving our client’s operational efficiency, incident response, and overall system performance.
- Collaborate with cross-functional teams, especially DevOps, to understand monitoring and dashboard requirements.
- Design and create comprehensive Datadog dashboards that provide real-time insights into system health, performance, and key performance indicators (KPIs)
- Customize and configure widgets, graphs, and alerts to effectively visualize and monitor metrics and logs
- Set up proactive alerts and notifications based on defined thresholds and conditions to quickly respond to performance degradations, downtime, or other critical events
- Tune alerting rules to reduce false positives and ensure timely and accurate notifications to the operations team
- Work with stakeholders to gather feedback on existing dashboards, iterate on improvements, and create new dashboards to accommodate changing monitoring needs
- Provide training and guidance to junior team members and other stakeholders on using and creating Datadog dashboards effectively.
- Stay up to date with industry best practices, emerging trends in monitoring and observability, and new Datadog features to continually enhance our monitoring capabilities.
- Bachelor’s or master’s degree in computer science or a related field.
- 3+ years of proven experience in Datadog dashboard design and infrastructure implementation.
- Proficiency in querying and analyzing metrics using query languages (e.g., Datadog Query Language, PromQL)
- Strong understanding of infrastructure components, networking, and cloud environments
- Familiarity with scripting languages (e.g., Python, Bash) for custom metric collection and automation
- Solid grasp of system and application performance metrics, resource utilization, and troubleshooting methodologies
- Ability to translate technical requirements into effective monitoring solutions.
- Datadog certification or equivalent (e.g., Datadog Fundamentals)
To apply for this job, email your details to email@example.com.