Big data answers quiz 2

I am not sure that all the answers are correct, but you will definitely pass. Follow the questions below.


Which of the libraries supports real-time streaming?

ANS:- Spark Streaming.


Choose the function which is not a transformation.

1.sample

2.Reduce

3.Cogroup

4.union

Which of these is not related to the Spark ecosystem?

Spark CTL


Which is the graph library for Spark?

1.Network

2.None of the above

3.NetworkX

4.GraphX


Which of the following statements is true about lazy execution?

Ans:- It is used in situations when it's not mandatory to execute a bunch of programs immediately.

What are the different memory storage levels?

1.Memory_only_ser

2.Memory_and_disk

3.all of the above

4.Memory_only

 

Can a worker node have more than one worker?

True

What are the different languages supported by Spark for big data applications?

1.java

2.scala

3.R

4.Python

5.All of the above

What are the characteristics of an RDD?

1.Immutable

2.Partitioned

3.Resilient

4.All of the above


Choose the Spark Machine Learning library from below

1.MLlib

2.Mahout

3.BlinkDB

4.GraphX

Memory management, monitoring jobs, fault tolerance, job scheduling and interaction with storage systems are handled by


Ans:- Spark Engine


What are operations provided by RDD?

1.Action

2.both 1&3

3.Transformation

4.None of the above

Which are the common Spark ecosystems?

1.SparkSQL

2.Spark Streaming

3.BlinkDB

4.All of the above


Is the reduce function an action?

1.yes

2.no

3.don't know


What are the responsibilities of Spark Engine?

1. Distributing

2.Scheduling

3.monitoring

4.All of the above

Is a transformation executed before an action follows?

1.No

2.Yes

3.Don't Know.
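
To make the idea of lazy execution concrete, here is a minimal sketch in plain Python rather than Spark itself. It assumes generators are a fair stand-in for transformations and that materialising the result with `list` plays the role of an action; the names `lazy_map`, `lazy_filter` and `log` are hypothetical and exist only for this illustration.

```python
log = []  # records each time the "map" step actually runs


def lazy_map(func, items):
    # Generator: the body does not run until a consumer iterates over it.
    for item in items:
        log.append(f"map {item}")
        yield func(item)


def lazy_filter(pred, items):
    # Also a generator, so chaining it adds no work up front.
    for item in items:
        if pred(item):
            yield item


# Building the pipeline is like declaring transformations: no work happens yet.
pipeline = lazy_filter(lambda v: v > 10, lazy_map(lambda v: v * 10, [1, 2, 3]))
ran_before_action = len(log)

# Materialising the result plays the role of an action and triggers the work.
result = list(pipeline)
ran_after_action = len(log)

print(ran_before_action, result, ran_after_action)
```

In this sketch nothing is logged until `list(pipeline)` runs, which mirrors how a transformation only describes work that a later action triggers.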


Say I have a list of numbers in an RDD (say myrdd) and I want to compute the average:

def myAvg(x, y):
    return (x + y) / 2.0

avg = myrdd.reduce(myAvg)

Is there something wrong with it?


1.yes

2.no

3.Don't know
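
One way to see the problem is to run the same pairwise average with `functools.reduce` in plain Python. This is only a sketch and not Spark code: the list `numbers` and the split into two "partitions" are assumptions made for illustration. It shows that the result depends on how the values are grouped, which is why such a function is unsafe to pass to a distributed reduce.

```python
from functools import reduce


def my_avg(x, y):
    # Pairwise "average" as in the question; it is not associative.
    return (x + y) / 2.0


numbers = [1.0, 2.0, 3.0, 4.0]

# Folding strictly left to right over all values.
left_to_right = reduce(my_avg, numbers)

# Same values, but reduced per hypothetical partition and then combined.
by_partition = my_avg(reduce(my_avg, numbers[:1]), reduce(my_avg, numbers[1:]))

# An associative alternative: reduce (sum, count) pairs, then divide once.
total, count = reduce(
    lambda a, b: (a[0] + b[0], a[1] + b[1]),
    [(n, 1) for n in numbers],
)
mean_from_pairs = total / count

print(left_to_right, by_partition, mean_from_pairs)
```

The two groupings disagree with each other and with the true mean, while the (sum, count) version gives the same answer however the values are grouped.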

Choose the function which is not an action.

1.Reduce

2.Take

3.Collect

4.Pipe

Which of the following are common Spark ecosystems?

1.MLlib

2.Mahout

3.SparkSQL

4.both 1& 3


Which of these is not an identifier?

1.Numeric identifiers

2. Alpha Numeric identifiers.

3.Operator Identifiers

4.Literal Identifiers

5.Mixed Identifiers


The immutability feature of Scala helps in

1.Equality issues

2.Sequential programs

3. concurrent programs

5.Non_equality issues

6.both 2&3

What is the easiest way to format a string?

1.call.format

2.all of the above

3.call.arrange()

4.call.formatstring()


Which cluster managers can be used with Spark?

1.Mesos

2.YARN

3.Spark standalone

4.All of the above

Which one of the following is not part of stateless transformation?

1.Join

2.map

3.reduceByKey

4.filter

What is the default port for the Mesos web UI?

1.8080

2.8088

3.5050

4.none of the above

When you call a join operation on two pair RDDs, e.g. (K, V) and (K, W), what is the result?

1.(K,(v+w)) pairs with all pairs of elements for each key

2.(K,(v,w)) pairs with all pairs of elements for each key

3.(K,(v-w)) pairs with all pairs of elements for each key
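
The shape of a join result can be sketched in plain Python. This is an illustration of inner-join semantics on lists of pairs, not Spark's implementation; the helper `join_pairs` and the sample data are hypothetical.

```python
from collections import defaultdict


def join_pairs(left, right):
    # Index the right-hand side by key so each left pair can find its matches.
    right_by_key = defaultdict(list)
    for key, w in right:
        right_by_key[key].append(w)

    # Emit one (key, (v, w)) pair for every matching combination.
    return [
        (key, (v, w))
        for key, v in left
        for w in right_by_key.get(key, [])
    ]


left = [("a", 1), ("b", 2), ("a", 3)]
right = [("a", "x"), ("a", "y"), ("c", "z")]

joined = join_pairs(left, right)
print(joined)
```

Keys that appear on only one side ("b" and "c" here) produce no output, and a key with several values on both sides yields every combination of them.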

What are the keys used by wide transformation?

1.groupByKey

2.map()

3.reduceByKey

4.both 1& 3
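
As a rough illustration of why key-based operations differ from map(), the sketch below mimics the result shape of groupByKey and reduceByKey in a single Python process. The helpers `group_by_key` and `reduce_by_key` are hypothetical stand-ins; on a real cluster, bringing all values for a key together is what requires moving data between partitions.

```python
def group_by_key(pairs):
    # Collect every value seen for a key into one list.
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


def reduce_by_key(func, pairs):
    # Fold values for each key with func instead of keeping them all.
    reduced = {}
    for key, value in pairs:
        reduced[key] = func(reduced[key], value) if key in reduced else value
    return reduced


pairs = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("c", 5)]

grouped = group_by_key(pairs)
summed = reduce_by_key(lambda x, y: x + y, pairs)

print(grouped)
print(summed)
```

A per-element operation such as map() never needs to look at another record, whereas both helpers above depend on seeing every pair that shares a key.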

Which one of the following commands is not sent by the driver program to executors?

1.foreach

2.task

3.filter

4.map

Which of the following is not an example of creation of RDD using Spark Context?

1.sc.parallelize(0 to 100)

2.sc.broadcast("hello")

3.sc.textFile("README.md")

4.using sc.newAPIHadoopFile


Which Spark library allows reliable file sharing at memory speed across different cluster frameworks?

1.Tachyon

2.ByKey

3.map

4.reduceByKey()

Which of the following is not an operator in Spark?

1.map()

2.mapred()

3.filter()

4.reduceByKey()

5.groupByKey()


Which one of the following is not considered a block store?

1.RAM

2.memory

3.disk

4.offheap


Point out the error in the following code:

val conf = new SparkConf()
  .setMaster("local[1]")
  .setAppName("CountingSheep")
val sc = new SparkContext(conf)

Spark Context should be set first

setMaster("local[1]") should be replaced by set("local[1]")

"local[1]" means there is no parallelism

There is no Error


Which of the following data sources can Spark not process?

1.HDFS

2.Cassandra

3.HBase

4.MySQL

DAGScheduler uses an event queue architecture. True/False

True

False

What is a task with regard to Spark job execution?

1.A task can also be considered a computation in a stage on a partition in a given job attempt

2.A task belongs to a single stage and operates on a single partition (part of an RDD)

3.Tasks are spawned one by one for each stage and data partition

4.All of the above

A sliding window controls the transmission of data packets between various computer networks. True/False

True

Shuffling changes the number of partitions:

False

How can you use the machine learning library scikit-learn, which is written in Python, with the Spark engine?

1.can be used as pipeline API

2.using Spark MLlib

3.using pipe()

4.All of the above

Which of the following is not a transformation operator on an RDD?

1.flatMap

2.reduceByKey

3.fork

4.cogroup

What is the advantage of a Parquet file?

1.limits I/O operations

2.consumes less space

3.fetches only required columns

4.all of the above

Partitions of an RDD can be controlled using which of the operations?

1.repartition

2.partition

3.coalesce

4.options 1 & 3


Which of the following is not a component of YARN?

1.Resource manager

2.Application master

3.Name node

4. name node container

Which of the following methods do pair RDDs use?

1.map

2.reduceByKey

3.join

4.both 2 & 3

What is the effect of setting spark.driver.allowMultipleContexts to true?

1.create multiple Spark contexts in a single JVM

2. Spark will log a warning instead of throwing exceptions

3.you cannot set the property to true

4.both 1& 2

Which of the following are high availability schemes?

1.standby masters with ZooKeeper

2.standby masters in HDFS

3.single-node recovery with local file system

4.both 1& 3

How will you start the Mesos shuffle service?

1./bin/start-mesos-shuffle-service.sh

2./bin/mesos-shuffle-service.sh

3.sbin/start-mesos-shuffle-service.sh

4./sbin/start-mesos-shuffle-service.sh

Which of the following operations does an RDD support?

1.Transformation

2.action

3.addition

4.both 1 and 2

What is the purpose of the driver in the Spark architecture?


1.The driver splits a Spark application into tasks and schedules them to run on executors

2.A driver is where the task scheduler lives and spawns tasks across workers

3.A driver coordinates workers and the overall execution of tasks

How to minimise data transfers when working with Spark?

1.using broadcast variables

2.using Accumulators

3.Avoid operations which trigger shuffle

4. All of the above

What is the use of Akka in Spark?

1.combiner

2.reducer

3. scheduler

4.mapper


Which one of the following does MLlib not provide?

1.classification

2.Association

3. Regression

4.clustering

Which one of the following is not a property of RDD?

1. Preferred Locations

2. Partitioner

3.Compute

4.combiner


What is the level of security in Spark?

1.Matured

2.Infancy

3.  Evolving

4. No Security


Apache Spark is a framework with?


1. scheduling

2. monitoring

3. Distributing Applications 

4.All of the above


In Spark, data is represented as?

1.Blocks

2.Chunks

3. RDDs

4. None of the above


What are sparse vectors?

Ans:- A sparse vector is a vector having a relatively small number of nonzero elements.
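
A small Python sketch can make the definition concrete. It assumes one simple representation, a (size, {index: value}) pair that stores only the nonzero entries; this is an illustrative choice rather than how any particular Spark class is implemented, and `to_sparse` and `sparse_dot` are hypothetical helper names.

```python
def to_sparse(dense):
    # Keep the overall length plus only the nonzero entries by index.
    return len(dense), {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(a, b):
    # Dot product that touches only indices that are nonzero in both vectors.
    size_a, values_a = a
    size_b, values_b = b
    if size_a != size_b:
        raise ValueError("vectors must have the same size")
    return sum(v * values_b[i] for i, v in values_a.items() if i in values_b)


dense = [0.0, 3.0, 0.0, 0.0, 5.0]
sparse = to_sparse(dense)

other = to_sparse([1.0, 2.0, 0.0, 0.0, 4.0])
dot = sparse_dot(sparse, other)

print(sparse, dot)
```

Storing two entries instead of five matters little here, but the saving grows when a vector has very many dimensions and only a handful of nonzero values.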


An adaptive optimisation framework that builds and maintains a set of multi-dimensional samples from the original data over time is called


1. Spark

2. Shark

3. BlinkDB

4. MapR

What is GraphX?

1. Library

2. Class

3. object

4. File


The collectAsMap() function collects the result as a map to provide easy lookup.


True

False 


Which of the following is not a characteristic shared by Hadoop and Spark?

1. both are data processing platforms

2. both are cluster computing environments

3. Both have their own file system

4. Both use open source APIs to link between different tools

