Vendor | : | EMC |
Exam Code | : | E20-065 |
Exam Name | : | Advanced Analytics Specialist Exam for Data Scientists |
Questions and Answers | : | 66 Q & A |
Which representation is most suitable for a small and highly connected network?
Edge list
Adjacency matrix
Eigenvector centrality
Adjacency list
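As an illustration of the representations named in the options above, the following minimal sketch converts an edge list into an adjacency matrix. The 4-node network and the helper name `to_adjacency_matrix` are hypothetical. In a small, dense graph most matrix cells are filled, so little space is wasted and an edge lookup is a single index operation.

```python
def to_adjacency_matrix(n, edges):
    """Build a symmetric 0/1 matrix for an undirected graph on nodes 0..n-1."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected: mirror every edge
    return matrix

# Hypothetical small, highly connected network: 5 of the 6 possible edges exist.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
matrix = to_adjacency_matrix(4, edges)

# Density = edges present / edges possible in a simple undirected graph.
density = len(edges) / (4 * 3 / 2)
print(matrix[0][1], matrix[1][3], round(density, 2))
```

For a large, sparse network the same matrix would be mostly zeros, which is where edge lists and adjacency lists tend to be the more economical choice.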
What is a characteristic of Spark?
Unable to run map -> reduce execution plans
Supports applications written in Python, Java, and Scala
Less efficient at processing small files than Hadoop MapReduce
Supports workflows that can return to previous work steps
If two of the communities are re-designated to be one community, how does that change the network characteristics? Refer to the exhibit.
Neighborhood overlap would increase
Network diameter would decrease
Modularity would increase
Modularity would decrease
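The exhibit is not reproduced here, so the sketch below uses a hypothetical toy graph (two triangles joined by one bridge edge) to show how modularity responds when two communities are merged. It assumes the standard Newman definition, Q = sum over communities of (internal edges / m) - (community degree sum / 2m)^2, and the function name `modularity` is illustrative only.

```python
def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical graph: triangle 0-1-2, triangle 3-4-5, bridge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {node: "A" for node in range(6)}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(round(q_split, 3), round(q_merged, 3))
```

In this toy case, merging two well-separated communities lowers modularity. The result for the actual exhibit depends on its edge structure.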
What is the maximum degree of a node in an undirected graph with 50 nodes?
49
50
1250
2500
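The arithmetic behind this question can be checked with a short sketch. It assumes a simple graph (no self-loops and no parallel edges), which the question does not state explicitly, and builds the complete graph on 50 nodes to count degrees and edges.

```python
from itertools import combinations

def complete_graph_degrees(n):
    """Return each node's degree in the complete simple graph on n nodes."""
    degree = [0] * n
    for u, v in combinations(range(n), 2):  # every unordered pair, no self-loops
        degree[u] += 1
        degree[v] += 1
    return degree

degrees = complete_graph_degrees(50)
max_degree = max(degrees)       # a node can link to each of the other n - 1 nodes
edge_count = sum(degrees) // 2  # handshake lemma: n * (n - 1) / 2 edges
print(max_degree, edge_count)
```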
What best describes the meaning behind the phrase "Six Degrees of Separation"?
Ability to use about six hops to reach any other node in an extremely large social network
Erdos number of all scholars having written papers with Paul Erdos
Maximum number of edges between nodes in a graph with a diameter of six
Typical distance between nodes that are connected by triadic closure
What do first-order and second-order Markov processes have in common concerning next word prediction?
Both use WordNet to model the probability of the next word
Both are unsupervised methods
Both provide the foundation to build a trigram language model
Neither makes assumptions about the probability of the next word
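To make the two Markov orders concrete, here is a minimal count-based sketch. A first-order model conditions the next word on the previous word (bigram counts); a second-order model conditions on the previous two words (trigram counts). The toy corpus and the names `train_markov` and `next_word_probs` are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train_markov(tokens, order):
    """Count which word follows each context of `order` preceding words."""
    counts = defaultdict(Counter)
    for i in range(order, len(tokens)):
        context = tuple(tokens[i - order:i])
        counts[context][tokens[i]] += 1
    return counts

def next_word_probs(counts, context):
    """Maximum-likelihood estimate of P(next word | context)."""
    seen = counts.get(tuple(context), Counter())
    total = sum(seen.values())
    return {word: c / total for word, c in seen.items()}

# Hypothetical toy corpus.
tokens = "the cat sat on the mat the cat ran".split()
first_order = train_markov(tokens, 1)   # bigram statistics
second_order = train_markov(tokens, 2)  # trigram statistics

p1 = next_word_probs(first_order, ["the"])
p2 = next_word_probs(second_order, ["the", "cat"])
print(p1, p2)
```

Both models are estimated from raw text counts alone; they differ only in how much preceding context they condition on.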
A marketing team creates a graph using a square for each data point, where the length of each side is set to the data value. The data values are 10 and 20. What is the lie factor of the graph?
1
2
3
6
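The calculation can be sketched as follows, assuming Tufte's definition of the lie factor (size of the effect shown in the graphic divided by the size of the effect in the data) and assuming that viewers perceive the area of each square rather than its side length. The function name `lie_factor` is illustrative.

```python
def lie_factor(data_a, data_b, graphic_a, graphic_b):
    """Tufte's lie factor: relative change in the graphic / relative change in the data."""
    data_effect = (data_b - data_a) / data_a
    graphic_effect = (graphic_b - graphic_a) / graphic_a
    return graphic_effect / data_effect

values = (10, 20)
# Each square's side equals the data value, so the perceived size is side squared.
areas = tuple(v ** 2 for v in values)  # (100, 400)

factor = lie_factor(values[0], values[1], areas[0], areas[1])
print(factor)
```

The data doubles (a 100% increase) while the drawn area quadruples (a 300% increase), which is what the ratio captures.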
You are analyzing written transcripts of focus groups conducted on product X. Your approach is to use TF-IDF for your analysis. What combination of TF-IDF scores should you examine to ensure you only report on the most important terms?
High TF score and high DF score
High TF score and high IDF score
High TF score and low IDF score
Low TF score and low DF score
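A minimal sketch of how the two components combine, using hypothetical tokenized transcripts. It assumes one common formulation, TF = term count / document length and IDF = ln(N / DF); other weighting variants exist. The function name `tf_idf` is illustrative.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per document, with score = TF * ln(N / DF)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical focus-group snippets, already tokenized.
documents = [
    ["battery", "battery", "life", "product"],
    ["product", "price", "good"],
    ["product", "design", "good"],
]
scores = tf_idf(documents)
top_term = max(scores[0], key=scores[0].get)
print(top_term, scores[0]["product"])
```

Here "product" appears in every transcript, so its IDF (and therefore its score) is zero, while "battery" is frequent within one transcript and rare across the others, so it ranks highest.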
What does YARN provide over and above MapReduce?
Separate cluster and resource management
Parallelized processing
Serialized processing
Access to HDFS data
What is an intended application of the MapReduce framework?
Processing can be broken into smaller pieces
Processing a large number of small files
Processing in real time is required
Processing a small subset of data
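The MapReduce pattern referred to above can be sketched in plain Python as a word count. This is a single-process illustration of the map, shuffle, and reduce phases, not Hadoop's API; the function names and input splits are hypothetical.

```python
from collections import defaultdict

def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)

# Each split could be handled by a separate mapper, independently of the others.
splits = ["big data big", "data science"]
mapped = [pair for split in splits for pair in map_phase(split)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(word_counts)
```

The job works because the input divides into pieces that can be mapped independently and recombined by key.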
What do lemmatization and stemming have in common?
Use WordNet
Remove common words in a natural language
Reduce the high dimensionality in text
Use a set of heuristics
EMC Education Services
Data Scientist, Advanced Analytics Specialist Version 1.0 (EMCDS)
Certification Description
Certification Overview
This certification is designed to build on the skills developed in the Associate level course in Data Science (Data Science & Big Data Analytics) and help aspiring Data Scientists continue to evolve and expand their skill sets. The main growth areas include advanced analytical methods, Hadoop (and Pig, Hive, HBase), Social Network Analysis, Natural Language Processing, and Visualization methods. The development of these skills and the use of these methods provide the data scientist the ability to identify and communicate conclusions and recommendations in order to solve business challenges across many domains.
Certification Requirements
To complete the requirements for this certification you must:
Achieve the following Associate level certification:
Data Science Associate (EMCDSA)
Pass the following Specialist exam on or after May 1, 2015:
E20-065 Advanced Analytics Specialist Exam for Data Scientists
Note: These details reflect certification requirements as of May 1, 2015.
The Proven Professional Program periodically updates certification requirements.
*Please check the Proven Professional CertTracker website regularly for the latest information and for other options to meet the Associate level requirement.
E20-065 Advanced Analytics Specialist Exam for Data Scientists
Exam Description
Overview
This exam is a qualifying exam for the Data Scientist, Advanced Analytics Specialist certification.
This exam focuses on MapReduce, the Hadoop Ecosystem, NoSQL, Natural Language Processing, Social Network Analysis, Data Science Theory and Methods, and Data Visualization.
Duration
90 Minutes
(60 Questions)
Pass Score: 63
EMC provides free practice tests to assess your knowledge in preparation for the exam. Practice tests allow you to become familiar with the topics and question types you will find on the proctored exam. Your results on a practice test offer one indication of how prepared you are for the proctored exam and can highlight topics on which you need to study and train further. A passing score on the practice test does not guarantee a passing score on the certification exam.
Exam Topics
Topics likely to be covered on this exam include:
MapReduce (15%)
MapReduce framework and its implementation in Hadoop
Hadoop Distributed File System (HDFS)
Yet Another Resource Negotiator (YARN)
Hadoop Ecosystem and NoSQL (15%)
Pig
Hive
NoSQL
HBase
Spark
Natural Language Processing (NLP) (20%)
NLP and the four main categories of ambiguity
Text Preprocessing
Language Modeling
Social Network Analysis (SNA) (23%)
SNA and Graph Theory
Communities
Network Problems and SNA Tools
Data Science Theory and Methods (15%)
Simulation
Random Forests
Multinomial Logistic Regression and Maximum Entropy
Data Visualization (12%)
Perception and Visualization
Visualization of Multivariate Data
The percentages after each topic above reflect the approximate distribution of the total question set across the exam.
Recommended Training
The following curriculum is recommended for candidates preparing to take this exam. Please complete one of the following courses:
Course Title | Course Number | Mode | Available |
Advanced Methods in Data Science and Big Data Analytics | MR-1CP-ETAAMUSD | Instructor-Led | 4/6/15 |
Advanced Methods in Data Science and Big Data Analytics - Video ILT | MR-1TP-ETAAMUSD-966 | Video ILT-Stream | 5/11/16 |
Advanced Methods in Data Science and Big Data Analytics - Online ILT | MR-1LP-ETAAMUSD | Online ILT | 4/19/16 |
Note: These exam description details reflect contents as of May 1, 2015.
The Proven Professional Program periodically updates exams to reflect technical currency and relevance. Please check the Proven Professional website regularly for the latest information.
EMC², EMC, EMC Proven, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. VMware is a registered trademark or trademark of VMware, Inc., in the United States and other jurisdictions. © Copyright 2015 EMC Corporation. All rights reserved. Published in the USA. 5/15
EMC believes the information in this document is accurate as of its publication date. The information is subject to change without notice.