[Solved] PageRank: Develop a Spark Program for Calculating PageRank (Q37163478)

PageRank: develop a Spark program for calculating PageRank.

– Use sc.textFile("input.txt") to read the input.

* Each line of the input.txt file describes a link in the graph, i.e., source_node space(s) destination_node. For example, "1 2" represents a link from node_1 to node_2. Use the provided input.txt as a reference.

– Use rdd.saveAsTextFile("output") to save the output.

* You may save the result of page rank to multiple files.

* In each file, each line records the page rank mass of a node. The records in a file need to be in ascending order of the nodes.
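The input and output rules above can be sketched with a few plain-Python helpers. This is a minimal sketch under stated assumptions, not the assignment's reference solution: the names `parse_edge`, `format_record`, and `to_output_lines` are hypothetical, node IDs are assumed to be integers (so that "ascending order" means numeric order, with 2 before 10), and the "node rank" layout of an output line is an assumption because the assignment does not fix it.

```python
def parse_edge(line):
    """Parse 'source destination', tolerating any run of spaces or tabs."""
    src, dst = line.split()
    return int(src), int(dst)  # assumption: node IDs are integers


def format_record(node, rank):
    """Render one output line; the 'node rank' layout is an assumption."""
    return f"{node} {rank}"


def to_output_lines(ranks):
    """Order (node, rank) pairs by ascending node ID and render them."""
    return [format_record(node, rank) for node, rank in sorted(ranks)]
```

In a Spark job the same ordering is usually obtained by calling `sortByKey()` on the (node, rank) RDD before `saveAsTextFile("output")`. Because `sortByKey()` range-partitions the data, each part file is then sorted internally and the part files are in order relative to each other.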

• Requirements:

– The Spark program needs to take one argument to specify how many iterations the algorithm needs to go through. The default number of iterations is 10, which is specified inside the program.

– The total pagerank mass is 1.

– The program needs to deal with the dangling nodes, which have no outgoing links, and the random jump as shown in Equation (1).

* α: random jump factor, use 0.1 in this programming assignment.

* |G|: total number of nodes in the graph.

* m: the missing page rank mass due to dangling nodes.

– Initialize the page rank mass of each node to 1/|G|.

Equation (1): $p' = \alpha \left(\frac{1}{|G|}\right) + (1-\alpha)\left(\frac{m}{|G|} + p\right)$
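To make the requirements concrete, here is a single-machine, plain-Python sketch of the iteration defined by Equation (1). It is not a Spark program and not the official solution. It assumes that p in Equation (1) is the mass a node receives over its incoming links in the current iteration (the assignment text does not define p explicitly), and that duplicate links count once. The function name `pagerank` and its parameters are hypothetical.

```python
from collections import defaultdict


def pagerank(edges, iterations=10, alpha=0.1):
    """Reference iteration for Equation (1); `edges` is a list of (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    size = len(nodes)  # |G|
    out_links = defaultdict(set)
    for src, dst in edges:
        out_links[src].add(dst)  # assumption: duplicate links are collapsed

    ranks = {n: 1.0 / size for n in nodes}  # initial mass 1/|G| per node
    for _ in range(iterations):
        received = {n: 0.0 for n in nodes}  # p: mass arriving over in-links
        missing = 0.0                       # m: mass held by dangling nodes
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                share = ranks[n] / len(targets)
                for t in targets:
                    received[t] += share
            else:
                missing += ranks[n]
        # Equation (1): p' = alpha * (1/|G|) + (1 - alpha) * (m/|G| + p)
        ranks = {
            n: alpha * (1.0 / size) + (1 - alpha) * (missing / size + received[n])
            for n in nodes
        }
    return ranks
```

Under this reading the total mass stays at 1 after every iteration: the received masses plus m add up to the previous total, and the random-jump term redistributes a fraction α of it evenly.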

input.txt:

1 2
1 3
1 4
2 1
3 1
4 1
1 5
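In this sample graph, node 1 links to nodes 2, 3, 4, and 5, nodes 2, 3, and 4 link back to node 1, and node 5 has no outgoing links, so node 5 is the dangling node. One possible way to arrange the computation as a PySpark RDD pipeline is sketched below. It is an illustrative sketch, not the expected solution: the helper names (`parse_iterations`, `spread`, `update`, `run_pagerank`) are hypothetical, node IDs are assumed to be integers, duplicate links are assumed to count once, p is assumed to be the mass received over incoming links, and the "node rank" output layout is an assumption. The function takes an existing SparkContext, so the block itself does not import Spark.

```python
import sys

ALPHA = 0.1  # random jump factor given in the assignment


def parse_iterations(argv, default=10):
    """Return the iteration count from the first CLI argument, else the default."""
    return int(argv[1]) if len(argv) > 1 else default


def spread(neighbors, rank):
    """Split a node's rank evenly over its outgoing links."""
    share = rank / len(neighbors)
    return [(dst, share) for dst in neighbors]


def update(received, missing, size, alpha=ALPHA):
    """Equation (1): p' = alpha/|G| + (1 - alpha) * (m/|G| + p)."""
    return alpha / size + (1 - alpha) * (missing / size + received)


def run_pagerank(sc, iterations, in_path="input.txt", out_path="output"):
    """Sketch of the RDD pipeline; `sc` is an existing SparkContext."""
    edges = (sc.textFile(in_path)
             .map(lambda line: line.split())
             .filter(lambda parts: len(parts) == 2)   # skip blank lines
             .map(lambda parts: (int(parts[0]), int(parts[1])))
             .distinct())
    links = edges.groupByKey().mapValues(list).cache()     # nodes with out-links
    nodes = edges.flatMap(lambda e: e).distinct().cache()  # all nodes, dangling too
    size = nodes.count()                                   # |G|
    ranks = nodes.map(lambda n: (n, 1.0 / size)).cache()

    for _ in range(iterations):
        # Mass sent along links; dangling nodes drop out of this inner join.
        sent = links.join(ranks).flatMap(lambda kv: spread(kv[1][0], kv[1][1]))
        # Zero entries keep nodes without incoming links in the result.
        received = (sent.union(nodes.map(lambda n: (n, 0.0)))
                    .reduceByKey(lambda a, b: a + b))
        # m: mass currently held by nodes that have no outgoing links.
        missing = ranks.subtractByKey(links).values().sum()
        # Bind `missing` as a default argument so each iteration keeps its own value.
        ranks = received.mapValues(lambda p, m=missing: update(p, m, size)).cache()

    ranks.sortByKey().map(lambda kv: f"{kv[0]} {kv[1]}").saveAsTextFile(out_path)


# Typical driver wiring (commented out so the sketch loads without Spark installed):
# from pyspark import SparkContext
# sc = SparkContext(appName="PageRank")
# run_pagerank(sc, parse_iterations(sys.argv))
```

For the sample input, every node starts with mass 0.2. In the first iteration node 1 receives 0.6 and m is 0.2, so `update(0.6, 0.2, 5)` evaluates to about 0.596. Each of the other four nodes receives 0.05 and ends up with about 0.101, which keeps the total at 1.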
