Database / InfluxDB interview questions
InfluxDB is optimized for time-stamped metrics and events. It is strong for high-ingest workloads and time-windowed analytics in observability and IoT systems.
Relational systems excel at normalized transactional data; InfluxDB is purpose-built for append-heavy measurements with timestamps and fast aggregate queries.
A measurement groups related points. Tags are indexed dimensions, fields store the actual values, and the timestamp marks the event time for each point.
A bucket stores time series points and defines retention behavior. It is the main logical container for data in InfluxDB v2.
Retention controls how long data stays in storage. It reduces cost and keeps long-range queries efficient by removing stale high-resolution data.
Line protocol is a compact write format consisting of a measurement, tag set, field set, and timestamp. It is designed for fast ingestion.
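The line protocol format above can be sketched by assembling a point by hand. This is a minimal illustration, not a client library: the measurement, tag, and field names are made up, and a real client handles escaping and more types.

```python
# Sketch: building an InfluxDB line protocol point by hand.
# Shape: measurement,tag_set field_set timestamp
# The measurement, tags, and values below are illustrative.

def to_line_protocol(measurement, tags, fields, timestamp_ns):
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))

    def fmt(value):
        # Strings are quoted, booleans are bare words,
        # integers take an 'i' suffix, floats are bare.
        if isinstance(value, str):
            return f'"{value}"'
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"
        return repr(value)

    field_str = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"

line = to_line_protocol(
    "cpu", {"host": "server01", "region": "eu"},
    {"usage": 0.64, "cores": 8}, 1700000000000000000,
)
print(line)
# cpu,host=server01,region=eu usage=0.64,cores=8i 1700000000000000000
```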
Use tags for dimensions you filter often because tags are indexed. Keep numeric or measured values in fields.
Cardinality is the number of unique series, that is, distinct combinations of measurement and tag set. Very high cardinality increases memory and index pressure.
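The series-counting idea can be shown in a few lines. This is a sketch with made-up points: each unique (measurement, tag set) pair is one series, regardless of how many points land in it.

```python
# Sketch: series cardinality as the count of unique
# (measurement, tag set) combinations. Sample points are illustrative.

def series_key(point):
    tags = tuple(sorted(point["tags"].items()))
    return (point["measurement"], tags)

points = [
    {"measurement": "cpu", "tags": {"host": "a", "region": "eu"}},
    {"measurement": "cpu", "tags": {"host": "a", "region": "eu"}},  # same series
    {"measurement": "cpu", "tags": {"host": "b", "region": "eu"}},  # new host
    {"measurement": "mem", "tags": {"host": "a", "region": "eu"}},  # new measurement
]

cardinality = len({series_key(p) for p in points})
print(cardinality)  # 3
```

This also shows why a per-request ID in a tag is dangerous: every write would mint a new series key.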
Avoid unique per-event identifiers in tags. Keep tag sets stable and move volatile identifiers like request IDs into fields.
Use Flux when you need richer transformations, scheduled processing, or more expressive pipelines across time series datasets.
InfluxQL is SQL-like and familiar to many teams, especially for legacy query patterns and simpler time series retrieval.
A typical Flux pattern uses from, then range, then filter to narrow data by time and measurement before aggregation.
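The from, range, filter shape can be written out concretely. The query below is held in a Python string so the block is self-contained; the bucket name, measurement, and window size are illustrative.

```python
# Sketch: the typical Flux pipeline shape, narrowing by time first,
# then by measurement, then aggregating. Names are illustrative.
flux_query = """
from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> aggregateWindow(every: 5m, fn: mean)
"""

# The stages appear in narrowing order: source, time bound, predicate, rollup.
stages = ["from(", "range(", "filter(", "aggregateWindow("]
positions = [flux_query.index(s) for s in stages]
print(positions == sorted(positions))  # True
```

Putting range before filter matters: bounding the time window first limits how much data the predicate and aggregation ever touch.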
InfluxDB v2 uses scoped API tokens with explicit read and write permissions. Production systems should rotate and store tokens securely.
IoT pipelines often micro-batch points and retry transient failures. Stable device tags improve queryability and operational consistency.
Define acceptable lateness windows, maintain clock synchronization, and run controlled backfill jobs for corrected records.
Downsampling stores long-term trends at lower granularity so storage and long-range query cost remain manageable.
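The rollup arithmetic behind downsampling can be sketched in plain Python: raw points are grouped into fixed windows and each window is reduced to a mean. The window size and sample data are illustrative.

```python
# Sketch: downsampling by averaging raw points into fixed time windows.
# points: list of (timestamp_seconds, value); window_s: window width.

def downsample(points, window_s):
    buckets = {}
    for ts, val in points:
        # Align each point to the start of its window.
        buckets.setdefault(ts - ts % window_s, []).append(val)
    return [(start, sum(vs) / len(vs)) for start, vs in sorted(buckets.items())]

raw = [(0, 1.0), (10, 3.0), (65, 5.0), (70, 7.0)]
print(downsample(raw, 60))  # [(0, 2.0), (60, 6.0)]
```

Four raw points collapse to two rollup points; at scale that ratio is what keeps long-range queries cheap.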
Tasks run Flux scripts on schedules to automate rollups, data quality routines, and recurring transformations.
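A scheduled rollup task can be sketched as a Flux script, shown here inside a Python string so the block stands alone. The task name, schedule, bucket names, and measurement filter are all assumptions for illustration.

```python
# Sketch: a Flux task that downsamples raw data on a schedule.
# Names, buckets, and intervals below are illustrative.
task_script = """
option task = {name: "hourly_cpu_rollup", every: 1h}

from(bucket: "raw")
  |> range(start: -task.every)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> aggregateWindow(every: 5m, fn: mean)
  |> to(bucket: "rollup")
"""

# The task reads only its own schedule window (-task.every) and writes
# the aggregated result into a separate rollup bucket.
print("option task" in task_script and 'to(bucket: "rollup")' in task_script)  # True
```

Writing rollups into a separate bucket pairs naturally with the bucket-per-lifecycle layout discussed below: raw data can carry a short retention while the rollup bucket keeps long-term trends.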
Telegraf is a plugin based metrics agent that collects from many sources and forwards data to InfluxDB.
Use stable dimensions like tenant, environment, and service. Avoid per-request or per-session identifiers in tags.
Use batch writes, efficient tag sets, optional compression, and nearby network paths. Validate retry and buffering settings in clients.
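The batch-and-retry behavior can be sketched without a real server. The transport below is a stand-in for an HTTP write path, and the retry count and backoff values are illustrative, not recommendations.

```python
# Sketch: micro-batched writes with retry and exponential backoff.
# FlakyTransport fakes one transient failure before succeeding.
import time

def write_batch(lines, transport, retries=3, backoff_s=0.05):
    for attempt in range(retries):
        try:
            transport(lines)
            return True
        except ConnectionError:
            time.sleep(backoff_s * (2 ** attempt))  # back off before retrying
    return False

class FlakyTransport:
    def __init__(self):
        self.calls = 0
    def __call__(self, lines):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transient")

transport = FlakyTransport()
ok = write_batch(["cpu,host=a usage=0.5"], transport)
print(ok, transport.calls)  # True 2
```

Real clients layer buffering on top of this so that retried batches do not block new writes; validating those settings is the point of the line above.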
A field type conflict occurs when one field key receives different data types across writes. Prevent this with producer-side validation.
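Producer-side validation can be as simple as checking fields against a declared schema before writing. The schema mapping and field names here are hypothetical.

```python
# Sketch: rejecting writes whose field types drift from a declared schema,
# so a field key never receives mixed types. Schema is illustrative.

FIELD_TYPES = {"usage": float, "cores": int, "status": str}

def validate_fields(fields):
    for key, value in fields.items():
        expected = FIELD_TYPES.get(key)
        if expected is not None and not isinstance(value, expected):
            raise TypeError(f"field {key!r}: expected {expected.__name__}, "
                            f"got {type(value).__name__}")
    return fields

validate_fields({"usage": 0.5, "cores": 8})   # passes
try:
    validate_fields({"cores": "eight"})       # type drift
except TypeError as e:
    print(e)  # field 'cores': expected int, got str
```

Catching the drift in the producer is far cheaper than untangling a rejected or mistyped field after it reaches the database.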
Run regular backups, test restore drills, and align retention, recovery objectives, and operational runbooks.
Track write errors, ingest latency, query latency, cardinality trends, disk usage, memory pressure, and task failures.
Reduce time range first, filter early with indexed tags, and inspect expensive operations like joins and pivots.
Separate buckets by environment and lifecycle such as raw and rollup. This helps permissions, retention, and operations.
Store latency and error metrics with stable tags like service, endpoint class, and environment, then compute windowed rollups.
Avoid unbounded tag values, inconsistent field typing, and mixing unrelated domains in one measurement.
Edge nodes buffer and forward batches during reconnect. Consistent timestamps and idempotent writes reduce duplication risk.
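Why consistent timestamps make retried edge batches safe can be sketched with a small model: a point is keyed by measurement, tag set, and timestamp, so a resent batch overwrites rather than duplicates. The store and values below are illustrative.

```python
# Sketch: idempotent writes modeled as an upsert keyed by
# (measurement, tag set, timestamp). Sample data is illustrative.

def upsert(store, measurement, tags, fields, ts):
    key = (measurement, tuple(sorted(tags.items())), ts)
    store.setdefault(key, {}).update(fields)
    return store

store = {}
for _ in range(2):  # the same batch replayed after a reconnect
    upsert(store, "cpu", {"host": "edge1"}, {"usage": 0.4}, 1700000000)

print(len(store))  # 1 — the replay overwrote, it did not duplicate
```

If the edge node stamped a fresh timestamp on each retry instead, every replay would create a new point, which is the duplication risk the line above warns about.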
Enable TLS, scope tokens, rotate secrets, segment network paths, and separate read and write credentials by workload.
Explain ingest path, storage behavior, indexing tradeoffs, retention lifecycle, query layer, and operational controls with practical examples.
