Integration / Apache NiFi Interview Questions
How do you implement deduplication in a NiFi flow?
Deduplication — preventing the same data from being processed more than once — is a common requirement. NiFi provides several mechanisms depending on scale, performance requirements, and what constitutes a duplicate.
DetectDuplicate processor: The simplest approach. It stores the identifiers it has already seen in a Distributed Map Cache, reached through a DistributedMapCacheClientService that connects to a DistributedMapCacheServer. For each incoming FlowFile, it evaluates a configurable Cache Entry Identifier (a NiFi Expression Language expression, e.g., ${filename} or ${sha256.hash}) and checks whether that key already exists in the cache. Duplicates route to the duplicate relationship; new items route to non-duplicate. The Age Off Duration property acts as a TTL, so identifiers older than that duration are forgotten and treated as new again.
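The routing decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not NiFi's implementation: the SeenCache class, its route method, and the injectable clock are hypothetical names, and a plain in-memory dictionary stands in for the distributed cache.

```python
import time
from typing import Callable, Dict


class SeenCache:
    """In-memory stand-in for the distributed map cache (hypothetical helper)."""

    def __init__(self, age_off_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.age_off_seconds = age_off_seconds
        self.clock = clock  # injectable so the age-off behaviour can be tested
        self._first_seen: Dict[str, float] = {}

    def route(self, identifier: str) -> str:
        """Return the relationship name a FlowFile with this identifier would take."""
        now = self.clock()
        first_seen = self._first_seen.get(identifier)
        if first_seen is not None and now - first_seen < self.age_off_seconds:
            return "duplicate"
        # Unknown or aged-off identifier: remember it and treat it as new.
        self._first_seen[identifier] = now
        return "non-duplicate"


if __name__ == "__main__":
    cache = SeenCache(age_off_seconds=3600)
    for name in ["orders.csv", "orders.csv", "invoices.csv"]:
        print(name, "->", cache.route(name))
```

A single-process dictionary gives no protection across cluster nodes; in NiFi that is the job of the shared cache server.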
Content-based hashing: Use the HashContent processor (or, in newer NiFi releases, CryptographicHashContent, which supersedes it) to compute a SHA-256 hash of the FlowFile content and store it as an attribute, then point DetectDuplicate's Cache Entry Identifier at that attribute. This detects content-identical duplicates regardless of filename or metadata.
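As a rough illustration of why hashing the content catches renamed copies, the sketch below derives the deduplication key from the bytes alone. The function names and the (filename, content) tuple shape are assumptions made for the example; in a real flow the processors do this work.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def content_key(content: bytes) -> str:
    """SHA-256 hex digest of the content, used as the deduplication key."""
    return hashlib.sha256(content).hexdigest()


def route_by_content(flowfiles: Iterable[Tuple[str, bytes]]) -> List[Tuple[str, str]]:
    """Pair each filename with 'duplicate' or 'non-duplicate' based on content only."""
    seen: Set[str] = set()
    routed: List[Tuple[str, str]] = []
    for filename, content in flowfiles:
        key = content_key(content)
        if key in seen:
            routed.append((filename, "duplicate"))
        else:
            seen.add(key)
            routed.append((filename, "non-duplicate"))
    return routed


if __name__ == "__main__":
    files = [
        ("report.csv", b"id,total\n1,10\n"),
        ("report_copy.csv", b"id,total\n1,10\n"),  # same bytes, different name
        ("other.csv", b"id,total\n2,20\n"),
    ]
    for filename, relationship in route_by_content(files):
        print(filename, "->", relationship)
```

Because the key ignores the filename, the renamed copy is flagged while the file with different bytes passes through.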
Database deduplication: For exactly-once semantics, track processed identifiers in a database table using PutDatabaseRecord with the INSERT_IGNORE statement type and a unique constraint on the identifier column. The database's ACID guarantees enforce uniqueness even under concurrent insertion.
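The idea of letting a unique constraint arbitrate can be tried outside NiFi with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the processed_ids table, the open_tracker and claim functions, and the sample identifier are all hypothetical. SQLite spells the statement INSERT OR IGNORE; other databases use different syntax, such as INSERT IGNORE in MySQL or ON CONFLICT DO NOTHING in PostgreSQL.

```python
import sqlite3


def open_tracker(path: str = ":memory:") -> sqlite3.Connection:
    """Open the tracking database and create the table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_ids (identifier TEXT PRIMARY KEY)"
    )
    return conn


def claim(conn: sqlite3.Connection, identifier: str) -> bool:
    """Try to record the identifier; True means it had not been seen before."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_ids (identifier) VALUES (?)",
            (identifier,),
        )
        # rowcount is 1 when the row was inserted, 0 when the constraint ignored it.
        return cur.rowcount == 1


if __name__ == "__main__":
    conn = open_tracker()
    for identifier in ["order-1001", "order-1001", "order-1002"]:
        status = "process" if claim(conn, identifier) else "skip (duplicate)"
        print(identifier, "->", status)
```

The primary key does the deciding, so two writers racing on the same identifier cannot both see a successful insert.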
