Hadoop job submission source code analysis

MelodyYN 2022-02-13 08:26:47


Hadoop job submission process source code

1. Source code flow


// Enter the waitForCompletion() method of the Job class
waitForCompletion()
    submit();
        // 1. Establish the connection
        connect();
            // 1) Create the proxy used to submit the Job
            new Cluster(getConfiguration());
                // (1) Determine whether the runtime environment is local or a YARN cluster
                initialize(jobTrackAddr, conf);
        // 2. Submit the job
        submitter.submitJobInternal(Job.this, cluster)
            // 1) Create the staging path
            Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
            // 2) Get the job ID and create the Job path
            JobID jobId = submitClient.getNewJobID();
            // 3) Copy the jar to the cluster
            copyAndConfigureFiles(job, submitJobDir);
                rUploader.uploadFiles(job, jobSubmitDir);
            // 4) Compute the splits and generate the split plan files
            writeSplits(job, submitJobDir);
                maps = writeNewSplits(job, jobSubmitDir);
                    input.getSplits(job);
            // 5) Write the XML configuration file to the staging path
            writeConf(conf, submitJobFile);
                conf.writeXml(out);
            // 6) Submit the Job and return the submission status
            status = submitClient.submitJob(jobId, submitJobDir.toString(), job.getCredentials());
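Step (1) under connect() in the trace decides between the local and YARN environments. The following is a minimal, dependency-free sketch of that decision, not Hadoop's actual code: the Provider interface, the providerFor helper, and the client names returned as strings are hypothetical stand-ins. Only the property name mapreduce.framework.name and its default value local are taken from Hadoop itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ProviderSelectionSketch {
    // Real Hadoop property name; its default value is "local".
    static final String FRAMEWORK_NAME = "mapreduce.framework.name";

    /** Hypothetical stand-in for a client provider; not the Hadoop interface. */
    interface Provider {
        /** Returns a client name if this provider handles the configured framework. */
        Optional<String> create(Map<String, String> conf);
    }

    /** Hypothetical helper: builds a provider that accepts exactly one framework value. */
    static Provider providerFor(String framework, String clientName) {
        return conf -> framework.equals(conf.getOrDefault(FRAMEWORK_NAME, "local"))
                ? Optional.of(clientName)
                : Optional.<String>empty();
    }

    // Stand-in for the provider list: one local entry and one YARN entry.
    static final List<Provider> PROVIDERS = Arrays.asList(
            providerFor("local", "LocalClient"),
            providerFor("yarn", "YarnClient"));

    /** Walks the provider list and returns the first client that accepts the configuration. */
    static String selectClient(Map<String, String> conf) {
        for (Provider provider : PROVIDERS) {
            Optional<String> client = provider.create(conf);
            if (client.isPresent()) {
                return client.get();
            }
        }
        throw new IllegalStateException(
                "No provider accepts " + FRAMEWORK_NAME + "=" + conf.get(FRAMEWORK_NAME));
    }

    public static void main(String[] args) {
        // prints "YarnClient"
        System.out.println(selectClient(Collections.singletonMap(FRAMEWORK_NAME, "yarn")));
        // prints "LocalClient" because the property is unset and defaults to "local"
        System.out.println(selectClient(Collections.<String, String>emptyMap()));
    }
}
```

The loop tries each provider in turn instead of switching on the property value, so that a new environment could in principle be supported by adding another provider to the list.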

2. Key points in the job submission process


  1. In the connect method, a Cluster object is created to provide the entry point for accessing the MR cluster. Inside Cluster, initialize(jobTrackAddr, conf) calls initProviderList(), which builds a provider list with entries for YarnClient and LocalClient; a for loop then iterates over that list and checks the configuration parameters.

    The mapreduce.framework.name parameter determines which environment the job runs in:
    If the value is yarn, the job runs in the YARN environment.
    If the value is local, the job runs in the local environment.

  2. The submitter is obtained for the current environment. It then:

    • Checks whether the output path already exists;

    • Provides a temporary staging directory, generates the job ID, and prepares to create the path "staging directory + job ID";

    • Uploads the job.xml configuration file, the split information, and (in YARN mode only) the jar into the "staging directory + job ID" directory:

      Cluster mode: the jar is submitted.

      Local mode: no jar is submitted.
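The submitter steps above can be sketched in plain Java without the Hadoop libraries. This is an illustrative model under stated assumptions, not the real JobSubmitter: the class and method names are hypothetical, and the staging path and job ID in main are made up. The file names job.xml, job.split, job.splitmetainfo, and job.jar are the names Hadoop conventionally uses in the job directory. Rejecting an existing output directory mirrors what FileOutputFormat does for file-based jobs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class StagingPlanSketch {
    /** Rejects an output directory that already exists (assumed FileOutputFormat-style check). */
    static void checkOutput(Path outputDir) {
        if (Files.exists(outputDir)) {
            throw new IllegalStateException("Output directory already exists: " + outputDir);
        }
    }

    /** The job directory is the staging directory plus the job ID. */
    static Path submitDir(Path stagingDir, String jobId) {
        return stagingDir.resolve(jobId);
    }

    /** Lists what would be uploaded; the jar is included only in cluster mode. */
    static List<Path> filesToUpload(Path submitDir, boolean clusterMode) {
        List<Path> files = new ArrayList<>();
        files.add(submitDir.resolve("job.xml"));           // configuration file
        files.add(submitDir.resolve("job.split"));         // split information
        files.add(submitDir.resolve("job.splitmetainfo")); // split metadata
        if (clusterMode) {
            files.add(submitDir.resolve("job.jar"));       // not submitted in local mode
        }
        return files;
    }

    public static void main(String[] args) {
        // Hypothetical staging path and job ID, chosen only for illustration.
        Path dir = submitDir(Paths.get("tmp", "staging"), "job_local_0001");
        for (Path file : filesToUpload(dir, false)) {
            System.out.println(file);
        }
    }
}
```

Passing clusterMode as a boolean keeps the local and cluster cases side by side; the real submitter derives the same distinction from the job configuration.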

Copyright: author [MelodyYN]. Please include the original link when reprinting, thank you. https://en.javamana.com/2022/02/202202130826446080.html