Spark 3.2 Tutorial (VII): Developing Spark SQL with Java in IDEA

Miss Zhu 2022-02-13 08:37:01 Views: 616


In the previous article we called the Spark SQL interface from Scala. In this article we implement the same functionality in Java, again processing a JSON file and a plain-text (Txt) file.
The contents of the JSON and Txt files are as follows:

{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}

Michael, 29
Andy, 30
Justin, 19

Add the following dependency to the project:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.2.0</version>
</dependency>

Java code for processing the JSON file:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class TestSQL {

    public static void main(String[] args) {
        // Start a Spark session that runs locally
        SparkSession spark = SparkSession
                .builder().master("local")
                .appName("Java Spark SQL basic example")
                .config("spark.some.config.option", "some-value")
                .getOrCreate();

        // Load the JSON file (one object per line) into a DataFrame
        Dataset<Row> df = spark.read().json("file:///d:/test_spark/people.json");
        df.show();

        // Register a temporary view so the data can be queried with SQL
        df.createOrReplaceTempView("people");
        // Rows without an age (Michael) do not satisfy age>20 and are left out
        Dataset<Row> sqlDF = spark.sql("select * from people where age>20");
        sqlDF.show();

        spark.stop();
    }
}

Java code for processing the Txt file. This requires a Person entity class:

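package com.alan.entity;

import java.io.Serializable;

// JavaBean for one record; Serializable so Spark can ship instances between tasks
public class Person implements Serializable {

    private String name;
    private long age;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public long getAge() {
        return age;
    }

    public void setAge(long age) {
        this.age = age;
    }
}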
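
The getters and setters matter because the main class passes Person.class to createDataFrame, which applies a schema to an RDD of JavaBeans. As a rough illustration of the JavaBean convention (not Spark's own code), the sketch below uses the JDK's java.beans.Introspector to list the readable properties of such a bean. The class name BeanProperties and the nested copy of Person are assumptions made only so the example compiles on its own.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

public class BeanProperties {

    /** Stand-in for the article's Person bean (hypothetical copy for this sketch). */
    public static class Person {
        private String name;
        private long age;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public long getAge() { return age; }
        public void setAge(long age) { this.age = age; }
    }

    /** Returns readable property name -> Java type name, sorted by property name. */
    public static Map<String, String> describe(Class<?> beanClass) {
        try {
            // Stop at Object.class so the inherited "class" property is not reported
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Map<String, String> props = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                // Only properties with a getter count as readable
                if (pd.getReadMethod() != null) {
                    props.put(pd.getName(), pd.getPropertyType().getSimpleName());
                }
            }
            return props;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        // prints {age=long, name=String}
        System.out.println(describe(Person.class));
    }
}
```

A field without a getter would not appear in this map, which is a useful check when a column seems to be missing from a bean-derived DataFrame.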
The main class reads the text file, maps each line to a Person, and queries the result:

import com.alan.entity.Person;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.*;

public class TestText {

    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder().master("local")
                .appName("Java Spark SQL basic example")
                .config("spark.some.config.option", "some-value")
                .getOrCreate();

        // Read the file line by line and convert each "name, age" line to a Person
        JavaRDD<Person> peopleRDD = spark.read()
                .textFile("d:/test_spark/people.txt")
                .javaRDD()
                .map(line -> {
                    String[] parts = line.split(",");
                    Person person = new Person();
                    person.setName(parts[0]);
                    // age is declared as long, so parse it as a long
                    person.setAge(Long.parseLong(parts[1].trim()));
                    return person;
                });

        // Apply a schema to an RDD of JavaBeans to get a DataFrame
        Dataset<Row> peopleDF = spark.createDataFrame(peopleRDD, Person.class);
        // Register the DataFrame as a temporary view
        peopleDF.createOrReplaceTempView("people");
        // SQL statements can be run with the sql method provided by spark
        Dataset<Row> olderThan20DF = spark.sql("select * from people where age>20");
        olderThan20DF.show();

        spark.stop();
    }
}
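
The line-parsing step above assumes every line has exactly the form "name, age"; a line with no comma or a non-numeric age would throw at parts[1] or while parsing the number. Below is a minimal sketch of a more defensive parser in plain Java, with no Spark dependency. The class name PersonLineParser, the parse method, and the nested copy of Person are hypothetical names introduced for illustration, not part of the original project.

```java
import java.util.Optional;

public class PersonLineParser {

    /** Stand-in for the article's Person bean (hypothetical copy for this sketch). */
    public static class Person {
        private String name;
        private long age;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public long getAge() { return age; }
        public void setAge(long age) { this.age = age; }
    }

    /** Parses "name, age"; returns Optional.empty() for malformed lines. */
    public static Optional<Person> parse(String line) {
        if (line == null) {
            return Optional.empty();
        }
        String[] parts = line.split(",");
        // Expect exactly two fields: name and age
        if (parts.length != 2) {
            return Optional.empty();
        }
        String name = parts[0].trim();
        if (name.isEmpty()) {
            return Optional.empty();
        }
        try {
            Person person = new Person();
            person.setName(name);
            person.setAge(Long.parseLong(parts[1].trim()));
            return Optional.of(person);
        } catch (NumberFormatException e) {
            // Age was not a valid number
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // prints Andy:30
        System.out.println(parse("Andy, 30")
                .map(p -> p.getName() + ":" + p.getAge())
                .orElse("skipped"));
        // prints false
        System.out.println(parse("broken line").isPresent());
    }
}
```

Whether to skip malformed lines or fail the job is a design choice: skipping keeps the job running on dirty data, while failing makes data problems visible early.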

Copyright: Miss Zhu. Please include the original link when reprinting, thank you. https://en.javamana.com/2022/02/202202130836594751.html