
Integration / Apache NiFi Interview Questions

How does NiFi integrate with cloud storage services like Amazon S3?

NiFi provides a set of processors for integrating with Amazon S3, packaged in the nifi-aws-nar extension.

ListS3: Lists objects in an S3 bucket, optionally filtered by prefix and age. Produces one FlowFile per S3 object with attributes such as s3.bucket, filename (the object key), s3.etag, s3.length, and s3.lastModified. Uses State Management to track what has already been listed, so subsequent runs emit only new or changed objects.
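The incremental behavior of ListS3 can be pictured with a small model. This is a simplified sketch, not NiFi's implementation: the `S3Object` class, the `list_new_objects` function, and the `last_listed` state key are hypothetical names, a plain dict stands in for State Management, and objects that share the exact timestamp of the last listed object are ignored here for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class S3Object:
    """Hypothetical stand-in for one entry of a bucket listing."""
    key: str
    last_modified: datetime


def list_new_objects(objects, state):
    """Return objects modified after the stored timestamp, oldest first.

    `state` is a plain dict standing in for NiFi's State Management;
    it is advanced to the newest timestamp that was emitted.
    """
    last_seen = state.get("last_listed")
    fresh = [o for o in objects
             if last_seen is None or o.last_modified > last_seen]
    if fresh:
        state["last_listed"] = max(o.last_modified for o in fresh)
    return sorted(fresh, key=lambda o: o.last_modified)


# Usage: the second run emits only the object added after the first run.
state = {}
bucket = [S3Object("a.csv", datetime(2024, 1, 1, tzinfo=timezone.utc))]
first_run = list_new_objects(bucket, state)

bucket.append(S3Object("b.csv", datetime(2024, 1, 2, tzinfo=timezone.utc)))
second_run = list_new_objects(bucket, state)
third_run = list_new_objects(bucket, state)
```

Because the state survives between runs, a processor restart does not cause the whole bucket to be listed again.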

FetchS3Object: Downloads the S3 object identified by its Bucket and Object Key properties, which are normally resolved from attributes on the incoming FlowFile (for example ${s3.bucket} and ${filename}). Used downstream from ListS3.

PutS3Object: Uploads FlowFile content to S3. Configurable settings include the bucket, the object key (Expression Language is supported, for example ${now():format('yyyy/MM/dd')}/${filename}), the storage class (such as STANDARD, INTELLIGENT_TIERING, or GLACIER), server-side encryption (SSE-S3 with AES-256, or SSE-KMS), and a multipart threshold above which large files are uploaded in parts.
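A minimal sketch of two PutS3Object ideas, written in plain Python rather than NiFi configuration: the date-partitioned key produced by the expression ${now():format('yyyy/MM/dd')}/${filename}, and the decision to switch to multipart upload. The function names, the parameter names, and the exact comparison (sizes at or below the threshold go as a single request) are assumptions for illustration, not NiFi's code.

```python
from datetime import datetime


def build_object_key(filename, now):
    """Mimic ${now():format('yyyy/MM/dd')}/${filename} for a given time."""
    return f"{now:%Y/%m/%d}/{filename}"


def upload_plan(size_bytes, multipart_threshold, part_size):
    """Return ("single", 1) or ("multipart", number_of_parts).

    Assumption: content larger than the threshold is split into
    parts of at most `part_size` bytes.
    """
    if size_bytes <= multipart_threshold:
        return ("single", 1)
    parts = -(-size_bytes // part_size)  # ceiling division
    return ("multipart", parts)


# Usage with small illustrative numbers (real thresholds are far larger).
key = build_object_key("data.csv", datetime(2024, 3, 7))
small = upload_plan(size_bytes=10, multipart_threshold=100, part_size=50)
large = upload_plan(size_bytes=250, multipart_threshold=100, part_size=100)
```

Partitioning keys by date keeps each day's uploads under a common prefix, which also makes prefix-filtered listing cheaper downstream.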

DeleteS3Object: Deletes an S3 object.

TagS3Object: Adds or updates S3 object tags.

Authentication is handled by an AWSCredentialsProviderControllerService, which supports static credentials, environment variable resolution, EC2 instance profiles (IAM roles), and an AWS credentials file. Using IAM roles is the recommended approach for deployments on EC2 or EKS, because it avoids managing static credentials entirely.

What is the recommended AWS authentication approach for NiFi running on EC2, avoiding static credential configuration?
What PutS3Object feature automatically switches to multipart upload for large files?
