


Explain VectorAssembler in MLlib.

VectorAssembler is a transformer that combines a given list of columns into a single vector column.

VectorAssembler accepts the following input column types: all numeric types, boolean type, and vector type. In each row, the values of the input columns will be concatenated into a vector in the specified order.
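To make the accepted types and the ordering concrete, here is a minimal sketch that assembles a numeric, a boolean, and a vector column into one vector. The object name, the `clicked` column, the `features` output name, and the local master setting are assumptions made for illustration; they do not come from the original answer.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession

object VectorAssemblerTypesSketch {

  // Assembles one row with numeric, boolean, and vector inputs and
  // returns the values of the resulting vector as a plain array.
  def assembleRow(spark: SparkSession): Array[Double] = {
    // One row: a long, a double, a boolean (hypothetical column), and a vector.
    val df = spark.createDataFrame(
      Seq((1L, 1.0, true, Vectors.dense(0.0, 11.0, 12.0)))
    ).toDF("id", "mobile", "clicked", "userFeatures")

    val assembler = new VectorAssembler()
      .setInputCols(Array("id", "mobile", "clicked", "userFeatures"))
      .setOutputCol("features") // output name chosen for this sketch

    // Values are concatenated in the order given to setInputCols;
    // the boolean is cast to a double and the vector is flattened in place.
    assembler.transform(df)
      .select("features")
      .head()
      .getAs[Vector](0)
      .toArray
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for illustration
      .appName("VectorAssemblerTypesSketch")
      .getOrCreate()
    try {
      println(assembleRow(spark).mkString(", "))
    } finally {
      spark.stop()
    }
  }
}
```

Changing the order of the names passed to `setInputCols` changes the order of the values in the assembled vector accordingly.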

scala> val vaDF = spark.read.option("multiLine",true).json("vectorAssemblerTest.data")
vaDF: org.apache.spark.sql.DataFrame = [id: bigint, mobile: double ... 3 more fields]
scala> vaDF.show
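+---+------+---------+----+-----------------+
| id|mobile|otherData|time|     userFeatures|
+---+------+---------+----+-----------------+
|  1|   1.0|      yes|  18|[0.0, 11.0, 12.0]|
+---+------+---------+----+-----------------+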

scala> import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.feature.VectorAssembler

scala> val assembler = new VectorAssembler()
assembler: org.apache.spark.ml.feature.VectorAssembler = vecAssembler_dbd3d0a8c760

scala> val assembler = new VectorAssembler().setInputCols(Array("id","mobile","time")).setOutp
assembler: org.apache.spark.ml.feature.VectorAssembler = vecAssembler_65938f964d7f

scala> val output = assembler.transform(vaDF)
output: org.apache.spark.sql.DataFrame = [id: bigint, mobile: double ... 4 more fields]

scala> output.show
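+---+------+---------+----+-----------------+------------------+
| id|mobile|otherData|time|     userFeatures|outputVectorColumn|
+---+------+---------+----+-----------------+------------------+
|  1|   1.0|      yes|  18|[0.0, 11.0, 12.0]|    [1.0,1.0,18.0]|
+---+------+---------+----+-----------------+------------------+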
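The session above depends on a data file that is not shown, so the following is a sketch of the same flow that can be run on its own. The JSON content is hypothetical, reconstructed from the single printed row, and is written to a temporary file. The output column name `outputVectorColumn` is taken from the printed result. The object name and the local master setting are likewise assumptions.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.SparkSession

object VectorAssemblerFromJsonSketch {

  // Hypothetical input, inferred from the row printed in the session.
  val sampleJson: String =
    """{
      |  "id": 1,
      |  "mobile": 1.0,
      |  "otherData": "yes",
      |  "time": 18,
      |  "userFeatures": [0.0, 11.0, 12.0]
      |}""".stripMargin

  def run(spark: SparkSession): Array[Double] = {
    // Write the sample record to a temporary file so the sketch is self-contained.
    val path = Files.createTempFile("vectorAssemblerTest", ".data")
    Files.write(path, sampleJson.getBytes(StandardCharsets.UTF_8))

    // multiLine lets Spark parse a JSON record that spans several lines.
    val vaDF = spark.read.option("multiLine", true).json(path.toUri.toString)

    // Only numeric columns are assembled here; userFeatures is read from
    // JSON as an array column, which is not a vector type.
    val assembler = new VectorAssembler()
      .setInputCols(Array("id", "mobile", "time"))
      .setOutputCol("outputVectorColumn")

    val output = assembler.transform(vaDF)
    output.show()

    output.select("outputVectorColumn").head().getAs[Vector](0).toArray
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for illustration
      .appName("VectorAssemblerFromJsonSketch")
      .getOrCreate()
    try {
      println(run(spark).mkString(", "))
    } finally {
      spark.stop()
    }
  }
}
```

In this sketch `userFeatures` is left out of the input columns because a JSON array is loaded as an array column rather than a vector column, and VectorAssembler accepts only numeric, boolean, and vector inputs.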

