Exam Professional Data Engineer topic 1 question 286 discussion - ExamTopics


This article presents a multiple-choice question regarding migrating Apache Spark jobs from an on-premises Hadoop cluster to Google Cloud using managed services, with minimal code changes and a tight deadline.

You have thousands of Apache Spark jobs running in your on-premises Apache Hadoop cluster. You want to migrate the jobs to Google Cloud. You want to use managed services to run your jobs instead of maintaining a long-lived Hadoop cluster yourself. You have a tight timeline and want to keep code changes to a minimum. What should you do?

  • A. Move your data to BigQuery. Convert your Spark scripts to a SQL-based processing approach.
  • B. Rewrite your jobs in Apache Beam. Run your jobs in Dataflow.
  • C. Copy your data to Compute Engine disks. Manage and run your jobs directly on those instances.
  • D. Move your data to Cloud Storage. Run your jobs on Dataproc.
Suggested Answer: D
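
Why D fits the constraints: Dataproc is a managed service that runs Apache Spark natively, so existing jobs can generally be submitted as they are, and the Cloud Storage connector lets Spark read and write gs:// paths in place of hdfs:// paths. In many cases the main code change is therefore the storage URI. The sketch below illustrates that idea in plain Python. The helper name hdfs_to_gcs, the bucket example-bucket, and the NameNode address are hypothetical and used only for illustration; real jobs may also need configuration or dependency adjustments.

```python
from urllib.parse import urlparse


def hdfs_to_gcs(uri: str, bucket: str) -> str:
    """Map an hdfs:// URI to a gs:// URI in `bucket`, keeping the path.

    Hypothetical helper: URIs with any other scheme (or none) are
    returned unchanged.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "hdfs":
        return uri  # local paths and other schemes are left untouched
    # Drop the NameNode host:port and keep only the file path.
    return f"gs://{bucket}{parsed.path}"


# Example: an on-premises input path rewritten for Cloud Storage.
print(hdfs_to_gcs("hdfs://namenode:8020/data/events/2024/", "example-bucket"))
```

A Spark job that previously read from the HDFS path would then read from the returned gs:// path, with the rest of the job logic left as it was.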
