• Monitoring & Status

  • Public Status Dashboard

    • URL: https://cybernode.ai
    • The status page shows real-time health of all public services with 90-day uptime history.
  • What’s Monitored

    • Endpoint Health

      • Endpoint        Check
        RPC             HTTP 200 + valid response
        LCD             HTTP 200 + valid JSON
        GraphQL         Query execution success
        IPFS Gateway    Content retrieval
        cyb.ai          Page load + content check
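
The "HTTP 200 + valid JSON" style of check listed above can be sketched roughly as follows. This is only an illustration, not the monitoring stack's actual probe: the function names `is_healthy` and `probe`, the 5-second timeout, and the rule that a non-JSON check passes on any non-empty body are all assumptions.

```python
import json
import urllib.error
import urllib.request


def is_healthy(status: int, body: bytes, expect_json: bool = True) -> bool:
    """Return True for HTTP 200 with a usable body.

    With expect_json=True the body must parse as JSON; otherwise any
    non-empty body is accepted (a stand-in for a real content check).
    """
    if status != 200:
        return False
    if not expect_json:
        return len(body) > 0
    try:
        json.loads(body)
    except ValueError:  # covers JSONDecodeError and bad UTF-8
        return False
    return True


def probe(url: str, timeout: float = 5.0, expect_json: bool = True) -> bool:
    """Fetch url and apply is_healthy; any network error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read(), expect_json)
    except (urllib.error.URLError, OSError):
        return False
```

Keeping the decision logic in `is_healthy` separate from the network call makes the check easy to exercise without reaching a live endpoint.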
    • SSL Certificates

      • All endpoints are monitored for SSL certificate expiry; alerts fire 30 days before a certificate expires.
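
A minimal sketch of the 30-day expiry rule, using only Python's standard `ssl` module. The helper names (`fetch_not_after`, `days_until_expiry`, `should_alert`) are hypothetical, and the production setup may implement this differently (for example through the Blackbox Exporter rather than a script).

```python
import socket
import ssl
from datetime import datetime, timezone

ALERT_WINDOW_DAYS = 30  # alert threshold stated in the text


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the certificate's notAfter field, e.g. 'Mar  1 00:00:00 2025 GMT'."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between now and the certificate's expiry (negative if expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


def should_alert(not_after: str, now: datetime) -> bool:
    """True once the certificate is inside the alert window."""
    return days_until_expiry(not_after, now) <= ALERT_WINDOW_DAYS


if __name__ == "__main__":
    not_after = fetch_not_after("cybernode.ai")
    print(should_alert(not_after, datetime.now(timezone.utc)))
```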
    • Blockchain Sync

      • Block height is monitored to detect if nodes fall behind the network.
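
One way to detect a node falling behind is to compare each node's height against the highest height observed. The sketch below assumes the RPC endpoint returns a CometBFT/Tendermint-style `/status` document with `result.sync_info.latest_block_height`; that response shape, the node names, and the 10-block threshold are assumptions, not details taken from this page.

```python
import json

MAX_LAG_BLOCKS = 10  # hypothetical threshold, not from the source


def height_from_status(status_json: str) -> int:
    """Extract the block height from an assumed /status response body."""
    doc = json.loads(status_json)
    return int(doc["result"]["sync_info"]["latest_block_height"])


def lagging_nodes(heights: dict, max_lag: int = MAX_LAG_BLOCKS) -> list:
    """Return names of nodes more than max_lag blocks behind the highest node."""
    if not heights:
        return []
    tip = max(heights.values())
    return sorted(name for name, h in heights.items() if tip - h > max_lag)


if __name__ == "__main__":
    observed = {"node-a": 1200, "node-b": 1195, "node-c": 1100}
    print(lagging_nodes(observed))
```

Using the highest observed height as the reference only works when at least one monitored node is in sync; comparing against an external reference node would cover the case where all local nodes stall together.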
    • IBC Relayer

      • Wallet balances and packet relay success rate are monitored.
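
The two relayer signals can be combined into a simple evaluation like the one below. It is a sketch under stated assumptions: the source does not give thresholds or say how packet counts are collected, so `min_balance`, `min_rate`, and the function names are hypothetical.

```python
def relay_success_rate(relayed: int, failed: int) -> float:
    """Fraction of packets relayed successfully; 1.0 when there was no traffic."""
    total = relayed + failed
    return 1.0 if total == 0 else relayed / total


def relayer_alerts(balance: float, min_balance: float,
                   relayed: int, failed: int, min_rate: float) -> list:
    """Return the alert labels that apply to one relayer wallet."""
    alerts = []
    if balance < min_balance:
        alerts.append("low balance")
    if relay_success_rate(relayed, failed) < min_rate:
        alerts.append("low success rate")
    return alerts


if __name__ == "__main__":
    # Hypothetical values: 5 tokens left, 90 of 100 packets relayed.
    print(relayer_alerts(balance=5, min_balance=10,
                         relayed=90, failed=10, min_rate=0.95))
```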
  • Dashboards

  • Metrics Stack

    • Component            Purpose
      Prometheus           Time-series metrics collection
      Grafana              Visualization and alerting
      Blackbox Exporter    HTTP/SSL endpoint probing
      Node Exporter        Server hardware metrics
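
To show how these components fit together, the sketch below asks Prometheus which Blackbox Exporter probes are currently failing. It uses the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and the exporter's `probe_success` metric; the Prometheus address and the helper names are assumptions, since this page does not describe the deployment.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumed address, not from the source


def instant_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def failed_targets(response_body: str) -> list:
    """Return instances whose probe_success sample is 0 in a vector result."""
    doc = json.loads(response_body)
    if doc.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    return sorted(
        sample["metric"].get("instance", "")
        for sample in doc["data"]["result"]
        if float(sample["value"][1]) == 0.0
    )


if __name__ == "__main__":
    url = instant_query_url(PROMETHEUS_URL, "probe_success")
    with urllib.request.urlopen(url, timeout=5) as resp:
        print(failed_targets(resp.read().decode("utf-8")))
```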
  • Alert Categories

    • Infrastructure Alerts

      • Disk space, RAM, CPU, system load
      • Block counter stalls (node not producing blocks)
      • ZFS pool health
      • GPU status (required for consensus)
    • Service Alerts

      • API endpoint availability
      • SSL certificate expiry
      • IPFS gateway responsiveness
      • IBC relayer wallet balance
  • Uptime Targets

    • Service         Target
      RPC/LCD         99.9%
      GraphQL         99.5%
      IPFS Gateway    99%
      cyb.ai          99.9%
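
An uptime target translates directly into a downtime budget. The sketch below works the arithmetic over a 90-day window, chosen to match the status page's 90-day history; the source does not say over which window the targets are actually measured, so treat the window as an assumption.

```python
def downtime_budget_minutes(target_percent: float, window_days: int = 90) -> float:
    """Minutes of downtime allowed in the window while still meeting the target."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - target_percent / 100)


if __name__ == "__main__":
    targets = {"RPC/LCD": 99.9, "GraphQL": 99.5, "IPFS Gateway": 99.0, "cyb.ai": 99.9}
    for service, target in targets.items():
        budget = downtime_budget_minutes(target)
        print(f"{service}: {budget:.1f} minutes per 90 days")
```

Over 90 days (129,600 minutes), 99.9% allows about 129.6 minutes of downtime, 99.5% about 648 minutes, and 99% about 1,296 minutes.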
  • Incident Response

    • Alerts are routed to the infrastructure team via Telegram.
    • Critical services auto-restart on failure.
    • ZFS snapshots enable quick rollback if needed.
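
Routing an alert to Telegram could look roughly like this, using the Telegram Bot API `sendMessage` method. The source only says alerts arrive via Telegram; that they are sent by a bot in this way, and the names `build_alert_request` and `send_alert`, are assumptions. The token and chat ID are placeholders.

```python
import json
import urllib.request


def build_alert_request(bot_token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build a POST request for the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(bot_token: str, chat_id: str, text: str, timeout: float = 10.0) -> bool:
    """Send the alert; returns True when Telegram answers with HTTP 200."""
    req = build_alert_request(bot_token, chat_id, text)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status == 200


if __name__ == "__main__":
    # Placeholder credentials; replace with a real bot token and chat ID.
    send_alert("<BOT_TOKEN>", "<CHAT_ID>", "ZFS pool degraded on node-a")
```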
  • Historical Data

    • Prometheus retains 90 days of metrics history, enabling:
      • Trend analysis
      • Capacity planning
      • Post-incident investigation