Medalla Challenge

barb@BuildingBlocksTechnologies.io
Oct 20, 2020

Updated: Jan 23, 2022

A Cluster Analysis of Medalla Testnet

Validator Distributions

October 20, 2020

Hi Friends! In this blog post we are going to summarize findings related to a numerical

analysis of the Validator distributions associated with the ETH 2.0 Medalla testnet.

This study was undertaken in response to a challenge set forth by the Ethereum

Foundation. The challenge was to isolate one or more aspects of the mounds of data

associated with this testnet to provide insights to make sense of it all. Here is a link to the

challenge solicitation.

A wishlist of aspects of the data to study was supplied by the challenge organizers. On that wishlist was a cluster analysis of the validators. This is the aspect of the network and data that we chose to tackle, and it is described in this blog post.

The methodology executed consisted of the following:

Gather data associated with the execution of Medalla Beacon Contracts from the Goerli network
Gather data associated with the parent that executed the Medalla Beacon Contract
Execute an Initial Validator distribution Analysis
Cluster to form sets of related validators; Assign “level of sophistication” to each validator
Report statistics on cluster composition and validator classification
Gather data on the network participation of each validator; Analyze overall performance of each category of validators
Draw final conclusions

Each of the above steps in the methodology will be described in an individual paragraph.

All code used in this challenge was written from scratch and can be found here:

https://github.com/BuildingBlocksTechnologiesLLC/Eth2.0-Validator-Clustering

Methodology Step #1:

Gather Medalla Beacon Contract Data

Overview: In order to become a validator for the Medalla testnet, it is necessary to

deposit 32 (test) Ether into a deposit contract on the Goerli network. Gathering and

organizing this data, therefore, gives the superset of all validators that may be participating

in testnet validator activities.

Approach: Use a python utility to walk through the pages of the following URL. This URL

represents the Goerli Etherscan Explorer pointer toward the Medalla Beacon Contract.

https://goerli.etherscan.io/txs?a=0x07b39F4fDE4A38bACe212b546dAc87C58DfE3fDC&p=<n>, where <n> is the page number, which as of this writing is in the range [1, 1508]

In addition to gathering the data related to the Medalla Beacon Contract execution, we take

one step out to the Beaconscan (a Medalla network explorer), to pick up the index assigned

by the Medalla network to the Validator. The URL where we get the Index is as follows:

https://beaconscan.com/validator/<ValidatorPublicKey>

Lastly we convert the Base16 validator public key to the Base64 representation, which

gives us 3 ways to identify the validator: Base16 public key, Base64 public key and

Medalla network assigned index. The 3 different representations are in preparation for

subsequent data gathering where some interfaces, such as the Prysm client API, might

require either the index or the Base64, as opposed to the Base16.
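The Base16 to Base64 conversion described above can be sketched with the Python standard library. This is a minimal illustration, not the code from getBeaconContracts.py; the function names and the dummy key are assumptions made for the example. It relies on the fact that a BLS validator public key is 48 bytes long.

```python
import base64


def hex_pubkey_to_base64(pubkey_hex: str) -> str:
    """Convert a Base16 (hex) validator public key to Base64."""
    # Accept keys with or without the conventional "0x" prefix.
    if pubkey_hex.startswith(("0x", "0X")):
        pubkey_hex = pubkey_hex[2:]
    raw = bytes.fromhex(pubkey_hex)
    if len(raw) != 48:
        raise ValueError(f"expected a 48-byte public key, got {len(raw)} bytes")
    return base64.b64encode(raw).decode("ascii")


def base64_pubkey_to_hex(pubkey_b64: str) -> str:
    """Convert a Base64 validator public key back to 0x-prefixed Base16."""
    return "0x" + base64.b64decode(pubkey_b64).hex()


# Dummy 48-byte key (not a real validator), used only to show the round trip.
example_hex = "0x" + "ab" * 48
example_b64 = hex_pubkey_to_base64(example_hex)
```

Because 48 bytes is a multiple of 3, the Base64 form is always 64 characters long with no padding.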

Data: At the end of this step, we have a CSV file that contains the following fields:

headers = ["index", "timeStamp", "validatorPublicKeyB16", "fromAddress", "block",

"txnHash", "status", "toAddress", "value", "txnFee", "gasLimit", "gasUsedByTxn",

"gasPrice", "nonce", "validatorPublicKeyB64"]

Notes: There is a time delay between when a contract is executed and when there is an

index assigned by the Medalla network. There is also a delay between when a validator is

assigned an index and when the validator becomes active. For the purposes of this study,

these delays do not matter, since there are plenty of validators that have been assigned an index and have been active for

a good period of time. For the analysis, we choose an arbitrary date of 9/24/2020 as the

cutoff for validators that we pull from the Goerli Beacon Contract execution.
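As a rough illustration of applying the 9/24/2020 cutoff to the CSV described above, here is a small sketch. The timestamp format, the inclusive treatment of the cutoff day, and the sample rows are all assumptions for the example; the actual file may store timestamps differently.

```python
import csv
import io
from datetime import datetime

# Assumption: the cutoff includes all of 2020-09-24.
CUTOFF = datetime(2020, 9, 24, 23, 59, 59)
# Assumption: timeStamp is stored as "YYYY-MM-DD HH:MM:SS".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def rows_before_cutoff(csv_file, cutoff=CUTOFF):
    """Return the CSV rows whose timeStamp is on or before the cutoff."""
    kept = []
    for row in csv.DictReader(csv_file):
        stamp = datetime.strptime(row["timeStamp"], TIMESTAMP_FORMAT)
        if stamp <= cutoff:
            kept.append(row)
    return kept


# Made-up rows using a subset of the real headers.
sample_csv = io.StringIO(
    "index,timeStamp,validatorPublicKeyB16,fromAddress\n"
    "1,2020-07-23 10:00:00,0xaaa,0x111\n"
    "2,2020-09-24 18:30:00,0xbbb,0x222\n"
    "3,2020-10-01 09:15:00,0xccc,0x333\n"
)
kept_rows = rows_before_cutoff(sample_csv)
```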

The name of the python utility for this step is getBeaconContracts.py and the data is

written to Data/beaconContractTransactions.csv (See introduction for the root location of

the repositories...)

Methodology Step #2:

Gather Data Related to the Parent of the Beacon Contract Execution

Overview: One of the goals of this study is to cluster, or “bucket”, validators into related groups and to categorize each individual validator as a “hobbyist”, a “staking service”, etc. A natural starting point for this activity is to look at where the

validator originated from – multiple validators that originate from the same “FromAddress”

(parent) are clearly related. Other relationships/associations are not quite as straightforward, but as we will show, clustering and scoring techniques can be employed to

achieve distributions with high probabilities of certainty.

Approach: Use a python utility to walk through the pages of the transactions associated

with the parent or “FromAddress”. Here is the base URL: https://goerli.etherscan.io/txs?a=<Address>

So, for example:

https://goerli.etherscan.io/txs?a=0xd9a5179f091d85051d3c982785efd1455cec8699

This will give all of the Goerli transactions that the parent has participated in. It will show

funds coming and going from the parent. It will show the MedallaBeaconContract

execution, etc.

Data: For each transaction, we scrape the following fields: TxnHash, Block, Age, From, To,

Value, TxnFee. At this point, the usefulness of this data might not be clear, but the

value will be shown in the next stage, where we use this information in our clustering

process.

Notes: The data that is output from the Python utility is slightly garbled. If loaded into

Excel, for example, not all of the columns line up. However, this anomaly is cleaned by a

process further down the chain. Additionally, this utility doesn’t recognize or account for

duplicates. Again, this is taken care of downstream for this analysis.
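The downstream cleanup mentioned in the notes could look roughly like the sketch below. It is not the code from the repository: it assumes the misalignment comes from stray empty cells, drops rows that still do not have the seven expected fields, and removes duplicates by TxnHash. The sample rows are made up.

```python
EXPECTED_FIELDS = ["TxnHash", "Block", "Age", "From", "To", "Value", "TxnFee"]


def clean_transactions(raw_rows):
    """Realign scraped rows and drop duplicates, keeping first occurrences."""
    seen_hashes = set()
    cleaned = []
    for fields in raw_rows:
        # Assumption: columns are shifted by stray empty cells, so drop them.
        fields = [f.strip() for f in fields if f.strip()]
        if len(fields) != len(EXPECTED_FIELDS):
            continue  # still malformed, skip the row
        record = dict(zip(EXPECTED_FIELDS, fields))
        if record["TxnHash"] in seen_hashes:
            continue  # duplicate transaction
        seen_hashes.add(record["TxnHash"])
        cleaned.append(record)
    return cleaned


# Made-up scraped rows: one duplicate, one shifted row, one truncated row.
raw_sample = [
    ["0xhash1", "100", "1 day ago", "0xa", "0xb", "32 Ether", "0.001"],
    ["0xhash1", "100", "1 day ago", "0xa", "0xb", "32 Ether", "0.001"],
    ["0xhash2", "", "101", "2 days ago", "0xc", "0xa", "5 Ether", "0.002"],
    ["0xhash3", "102"],
]
cleaned_sample = clean_transactions(raw_sample)
```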

The name of the python utility for this step is getParentTransactions.py .

The input data is the output from the previous step, which is Data/

beaconContractTransactions.csv. The output data is a file for each of the parents and is put

in the directory Data/ParentTransactions/<parent>

Methodology Step #3:

Initial Validator Distribution Analysis

Overview: In this step of the methodology, we look at some initial distributions of the

validators as a precursor to the formal clustering process.

Approach: The approach for this step is hierarchical, consisting of the following.

Initial validator distribution analysis based on parent (Java Utility, Validators.java)

In this step, the data that was scraped based on the Beacon contract execution is

sliced and diced. The following views of the data are generated:

Overall number of validators that successfully completed the contract (71287 for the time period covered in this analysis)
Mapping of day → total contracts, # singleton contracts, # multiple contracts, % singletons. “Singletons” are validators that appear to have no connection to other validators. This view of the data is meant to give an initial picture, on a daily basis, of the number of independent validators vs. the number of validators that were spun up in a group. For example, the following data shows that on July 23 a total of 1028 validators were spun up, but only 5 addresses created those 1028 validators, with almost 99% of them coming from 3 addresses.

2020-07-24 1028 Validators total

3 Addresses had MultipleValidators

2 Addresses had a Single Validator

Input: CSV file from scraping the Beacon Contracts (Data/beaconContractTransactions.csv)

Output: Flat file summary with the various mappings (Data/ValidatorDistributions/singletons.txt)
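The per-day singleton view can be sketched in a few lines of Python. This is an illustration only (the actual work is done by Validators.java); the field names, the reading of “% singletons” as the share of that day's validators coming from single-validator addresses, and the toy deposits are assumptions.

```python
from collections import Counter, defaultdict


def daily_singleton_summary(deposits):
    """Summarize deposits per day.

    deposits: iterable of (day, from_address) pairs, one pair per validator.
    """
    per_day = defaultdict(Counter)
    for day, address in deposits:
        per_day[day][address] += 1

    summary = {}
    for day in sorted(per_day):
        counts = per_day[day]
        total = sum(counts.values())
        single_addresses = sum(1 for n in counts.values() if n == 1)
        summary[day] = {
            "total_validators": total,
            "addresses_with_multiple": len(counts) - single_addresses,
            "addresses_with_single": single_addresses,
            # Each single-validator address contributes exactly one validator.
            "pct_singletons": 100.0 * single_addresses / total,
        }
    return summary


# Made-up deposits: address 0xA creates three validators, 0xB and 0xC one each.
toy_deposits = [
    ("2020-08-01", "0xA"), ("2020-08-01", "0xA"), ("2020-08-01", "0xA"),
    ("2020-08-01", "0xB"), ("2020-08-01", "0xC"),
]
toy_summary = daily_singleton_summary(toy_deposits)
```

Note that a per-day count only approximates a true singleton: an address with one validator on a given day may still be connected to others through later deposits or funding links, which is what the clustering step addresses.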

Mapping of day → FromAddress:NumberThatDay. This view of the data shows the number of validators that were spun up from each independent address on a given day. In keeping with the example above from July 23, if we look at that day, we see:

2020-07-23 0x388ea662ef2c223ec0b047d41bf3c0f362142ad5 1023

2020-07-23 0xf9a0a0706997fcec8389f152c15d540a8b7a8507 1

2020-07-23 0x11c100b0173d8cd7e73f6e1808e5a811fc5a93e7 2

2020-07-23 0x561548b4955dcc35d910525052dfe3ccc434d7cb 1

2020-07-23 0x162fa0abecf596e07fb8ef128c15cd589f7b473a 1

The above shows that 1023 validators out of 1028 originated from the same source. This could be a first indication that any validators connected to 0x388ea662ef2c223ec0b047d41bf3c0f362142ad5 are more than just hobbyists, as there appear to be over 1000 related validators.

See Notes below and stick with me as we unravel this data and the stories it might

be able to tell! Output data is in Data/ValidatorDistributions/fromAddressCounts.txt

An ordered mapping of #validators → address. So, for example:

256 0x38eae6c9e85bddfc1a9dd60853f30530fe5154bb 2020-07-24

256 0x03609041aaaa3ec1c2dd908e2c9f6abc0ac01a53 2020-07-24

256 0xf9acc04133f24194af4592ca1d7779695bd49524 2020-07-24

256 0xe5429311a89b284f6db006207952e04aba036db2 2020-07-24

256 0x8e8afd92f90201962718da22c944fa60e373676f 2020-07-30

256 0x6c7dedafc400ebf07f23bcfca4ec931a0a278b89 2020-08-22

In the above data, we show that there were 6 parent addresses that on various days spun

up batches of 256 validators. Which, if any, of these batches are related to each

other? Hang tight! I think our data can conclusively show which groups are related

to form a bigger group, again differentiating these particular validators from those of a “hobbyist”. Output data is Data/ValidatorDistributions/orderedCounts.txt
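An ordered mapping like the one above can be produced with a counter and a sort. The sketch below is an assumption about how the view could be built, not the logic in Validators.java; in particular, it counts per (address, day) pair, and the toy deposits are made up.

```python
from collections import Counter


def ordered_counts(deposits):
    """Return (count, from_address, day) rows, largest batches first.

    deposits: iterable of (day, from_address) pairs, one pair per validator.
    """
    counts = Counter(deposits)  # key: (day, from_address)
    rows = [(n, address, day) for (day, address), n in counts.items()]
    # Sort by descending count, then by day and address for a stable order.
    rows.sort(key=lambda row: (-row[0], row[2], row[1]))
    return rows


# Made-up deposits for illustration.
toy_batches = (
    [("2020-08-02", "0xB")] * 2
    + [("2020-08-01", "0xA")] * 3
    + [("2020-08-01", "0xC")]
)
ordered_rows = ordered_counts(toy_batches)
```

Equal counts on the same day, such as the batches of 256 above, end up adjacent in the output, which makes candidate groups easy to spot by eye.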

Summarize parent transactions (Java Utility, processParentTransactions.java)

Input: Files generated by scraping the transactions of the parent which were

generated in the previous step: Data/ParentTransactions/<address>

Output: CSV file that gives the following columns: Parent Address, TotalNumberOfTransactions, NumberOfInTransactions,

TotalAmountOfInFunds, HighestAmtOfInFunds, AddressWhereHighestCameFrom,

DateOfHighest, OtherTestnetContracts (Altona, PrysmOnyx, etc).

Script to run this step is: processParentTransactions.sh

“Summarizing Parent Transactions” deserves a bit more explanation. The motivation behind

this aspect of the data analysis is to evaluate the parents’ disposition, which will help to

categorize the validator itself. Example: Has the parent registered a validator contract on

a previous testnet? Has the parent received large sums of test Ether from some source OR

is its only source of funding one of the faucets? Etc. This information is used in the next

step of the methodology where we “Summarize the major characteristics of each

component of a validator tuple”.
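To make the parent summary more concrete, here is a hedged Python sketch that derives the listed columns from a parent's transactions. It is not a port of processParentTransactions.java. The record keys (including "Date"), the placeholder addresses, and the lookup table of known testnet contracts are hypothetical.

```python
def summarize_parent(parent, transactions, known_testnet_contracts=None):
    """Summarize one parent's transactions into the columns listed above.

    transactions: list of dicts with "From", "To", "Value" (float) and "Date".
    known_testnet_contracts: hypothetical map of contract address -> testnet name.
    """
    known = known_testnet_contracts or {}
    incoming = [t for t in transactions if t["To"].lower() == parent.lower()]
    largest = max(incoming, key=lambda t: t["Value"], default=None)
    other_testnets = sorted(
        {known[t["To"].lower()] for t in transactions if t["To"].lower() in known}
    )
    return {
        "ParentAddress": parent,
        "TotalNumberOfTransactions": len(transactions),
        "NumberOfInTransactions": len(incoming),
        "TotalAmountOfInFunds": sum(t["Value"] for t in incoming),
        "HighestAmtOfInFunds": largest["Value"] if largest else 0.0,
        "AddressWhereHighestCameFrom": largest["From"] if largest else None,
        "DateOfHighest": largest["Date"] if largest else None,
        "OtherTestnetContracts": other_testnets,
    }


# Made-up transactions with placeholder addresses.
toy_transactions = [
    {"From": "0xfaucet", "To": "0xparent", "Value": 1.0, "Date": "2020-07-01"},
    {"From": "0xwhale", "To": "0xparent", "Value": 64.0, "Date": "2020-07-20"},
    {"From": "0xparent", "To": "0xaltona", "Value": 32.0, "Date": "2020-07-02"},
    {"From": "0xparent", "To": "0xmedalla", "Value": 32.0, "Date": "2020-07-23"},
]
toy_parent_summary = summarize_parent(
    "0xparent", toy_transactions, {"0xaltona": "Altona"}
)
```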

Methodology Step #4:

Cluster to form sets of related Validators

Assign “level of sophistication” to each Validator

Overview: In this step of the methodology, we do the heavy lifting of evaluating patterns

to cluster and categorize the validators. We assign levels of sophistication based on

characteristics such as: Is the validator a “singleton”, i.e., does it appear to be just a single

validator with no connection to any other validators? OR, is the validator clearly a part of a

group that was spun up by the same parent? More complexly, is this validator a part of a

group in which the group itself appears to be part of a bigger group? Note that we move

away from the terminology of the introductory paragraph of “hobbyists, staking as a

service, exchanges, etc” and move into generic “levels of sophistication” as defined below.

Approach: As mentioned above, the heavy lifting is done in this step, and as such this

paragraph will be one of the longest of the blog post. The code to accomplish the

clustering was written from scratch in java and can be found in the Git repository

referenced in the introduction at Cluster.java.

In this blog post, we attempt to summarize the approach and algorithm in enough detail to provide a high level understanding without drowning the reader in too much detail. Below are the high level steps to the clustering approach.

The fundamental approach is the following:

Consider a tuple:

ValidatorPublicKey <-- Parent <-- FundingSource (WhoDepositedToParent)

Summarize the major characteristics of each component of the tuple.

FundingSource:

Does it appear to be a faucet, or have faucet like characteristics?
How many validator parents has this source sent funds to?
Has the funding source itself executed 1 or more Medalla deposit contracts?

Parent:

How many Beacon deposit contracts has this parent executed?
Has this parent participated in previous testnets (PrysmOnyx, OldPrysm, etc)?
What is the maximum input funds this parent has received?
What other parents has this parent sent funds to?

ValidatorPublicKey:

How does this validator sit within the web of relationships formed by the other 2 components of the tuple?

From a development standpoint, the approach to achieve a final answer is iterative,

consisting of data generation, manual review, application of rule sets – rinse, lather, repeat.

The final order of processing to generate the data for this blog post is as follows:

• Generate a parent to validator map. This groups all of the validators whose beacon contract was executed by a single entity.
• Merge parents into groups based on evidence of interactions between the parents
• Generate a funding to parents map. Apply patterns to produce clusters.
• Group and apply patterns on “max input funding”
• Review which validators are left after the above
• Last round to pick up singletons and those that didn’t fit into any other category
• Write to disk several flavors of output that describe the final cluster result

A few examples will be given to show how relationships were formed and how those relationships were then turned into clusters.

Example 1: Funds sent from one parent to another. Validators are clustered.

FundingSource: 0x039f528b7022572bae899ef7cf619b46951ffe10 --> Parent: 0xaf26f7c6bf453e2078f08953e4b28004a2c1e209 --> 1 Validator

Parent 0xaf26f7c6bf453e2078f08953e4b28004a2c1e209 sends funds to Parent 0x9e64b47bbdb9c1f7b599f11987b84c416c0c4110 --> 10 Validators

Results in a cluster with 11 validators.
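One common way to merge parents that have sent funds to each other is a union-find (disjoint-set) structure. The sketch below applies it to Example 1. To be clear, this is a swapped-in technique for illustration; Cluster.java may merge parents differently. The function name and the third, unrelated placeholder parent are assumptions.

```python
def cluster_parents(validators_per_parent, transfers):
    """Merge parents linked by fund transfers and total their validators.

    validators_per_parent: dict of parent address -> number of validators.
    transfers: iterable of (sender, receiver) parent address pairs.
    """
    root_of = {p: p for p in validators_per_parent}

    def find(p):
        # Follow links to the root, shortening the path along the way.
        while root_of[p] != p:
            root_of[p] = root_of[root_of[p]]
            p = root_of[p]
        return p

    for sender, receiver in transfers:
        # Only transfers between two known validator parents form a link.
        if sender in root_of and receiver in root_of:
            root_of[find(sender)] = find(receiver)

    clusters = {}
    for parent, count in validators_per_parent.items():
        entry = clusters.setdefault(find(parent), {"parents": [], "validators": 0})
        entry["parents"].append(parent)
        entry["validators"] += count
    return list(clusters.values())


PARENT_A = "0xaf26f7c6bf453e2078f08953e4b28004a2c1e209"  # 1 validator
PARENT_B = "0x9e64b47bbdb9c1f7b599f11987b84c416c0c4110"  # 10 validators
example_clusters = cluster_parents(
    {PARENT_A: 1, PARENT_B: 10, "0xunrelated": 1},  # last entry is a placeholder
    [(PARENT_A, PARENT_B)],
)
```

With these inputs the two linked parents form one cluster of 11 validators, and the placeholder parent stays on its own.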

Example 2: To revisit the example from Methodology Step #3, we saw the following

association as a result of mapping the data based on the number of validators:

256 0x38eae6c9e85bddfc1a9dd60853f30530fe5154bb 2020-07-24

256 0x03609041aaaa3ec1c2dd908e2c9f6abc0ac01a53 2020-07-24

256 0xf9acc04133f24194af4592ca1d7779695bd49524 2020-07-24

256 0xe5429311a89b284f6db006207952e04aba036db2 2020-07-24

256 0x8e8afd92f90201962718da22c944fa60e373676f 2020-07-30

256 0x6c7dedafc400ebf07f23bcfca4ec931a0a278b89 2020-08-22

And in fact, the clustering was able to put the 4 sets of 256 from July 24th together, for a total of 1024 validators in that cluster and an overall sophistication rating of LEVEL_3.

As part of the clustering process, a rule set is applied to assign a classification. Four levels

of classification were used and are defined as follows:

LEVEL_0 : Pure hobbyist (a few validators, got funds from faucet)
LEVEL_1 : Hobbyist who appears a bit more sophisticated than LEVEL_0
LEVEL_2 : Maybe a very sophisticated hobbyist OR small-scale institutional
LEVEL_3 : Highest level of sophistication, maybe institutional
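A rule set of this kind can be expressed as a short cascade of checks. The sketch below only shows the shape such rules might take: every threshold and input flag is an illustrative guess, not the rule set actually used in Cluster.java.

```python
def assign_level(num_validators, funded_by_faucet, merged_across_parents,
                 prior_testnets):
    """Assign a sophistication level to a cluster.

    All thresholds and conditions below are hypothetical and exist only
    to illustrate a rule cascade; they are not the values from the study.
    """
    if num_validators >= 100:
        return "LEVEL_3"
    if num_validators >= 10 or merged_across_parents:
        return "LEVEL_2"
    if prior_testnets or not funded_by_faucet:
        return "LEVEL_1"
    return "LEVEL_0"


# A single faucet-funded validator with no other evidence.
hobbyist_level = assign_level(1, True, False, False)
# A merged cluster the size of the 1024-validator group in Example 2.
large_cluster_level = assign_level(1024, False, True, False)
```

Ordering the checks from most to least sophisticated means a cluster receives the highest level for which it qualifies.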

Data: The input data for the clustering step comes from the previous steps in this

methodology. Precisely, the beacon contract transactions, the parent transaction

summaries and some additional data that is boiled down to show transactions where one

validator parent has sent funds to another parent validator contract.

Several flavors of intermediate and final output are generated. All files are in Data/Clusters/

ParentToValidatorMap.txt: parent → list of validators
FundingAddressOfMax.txt: fundingSource → list of validators
Levels.csv : Flat text lines that give every validator and some information about the cluster they were put in.
Clusters.csv : final clustered output, in a “free form” format. There are sections in this file for each of the hierarchical levels of clustering. Information about the “evidence” used to make a decision is also in this file. If you ran a Medalla testnet validator yourself, you can search the final Clusters.csv file to see where your validator(s) fell within our clustering process!

Notes: This step encompasses a lot. To summarize, the objective of this step is to cluster

each validator into a group of “related” validators, where the relationship may be based on

any of several characteristics discussed above. Additionally, each cluster (and thus, each validator) is assigned a level of sophistication.

Methodology Step #5:

Report Statistics on Cluster Formation and Validator Classification

Overview: In this step we present our statistics!

Approach: The Cluster.java program prints out a final summary.

Data: The final summary data is in Data/Cluster/FinalClusterSummary.csv and all of the

detailed data is in Data/Cluster/Clusters.csv. The basic stats are in Data/Cluster/ClusterStats.csv

Notes: For all of the work that it took to get to this point, the final statistics given are very

basic and one may claim not very interesting. However, we think it is interesting that the

clustering analysis shows that 75+% of the network is controlled by less than 3% of the

overall participants. This finding may contradict the statement on the Medalla Challenge

page which notes “Medalla as a testnet has mostly attracted hobbyists”. That statement is

true with respect to the pure number of participants but it might not be true with respect to

overall network influence. Additionally, assuming the final numbers are correct, if the 88

parents that control 75% of the network decided to stop participating, that would be

enough to put the network into a state where it could no longer achieve finality.
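The finality claim can be checked with simple arithmetic. Under the beacon chain's Casper FFG rules, a checkpoint is justified only when attestations from at least two thirds of the active stake support it. The sketch below assumes that every validator carries roughly the same stake, so that a share of validators equals a share of stake; the function name is made up for the example.

```python
from fractions import Fraction

# At least 2/3 of the active stake must attest for a checkpoint to be justified.
FINALITY_THRESHOLD = Fraction(2, 3)


def can_finalize(online_share):
    """Return True if the online share of stake is enough for finality."""
    return Fraction(online_share) >= FINALITY_THRESHOLD


# If the parents controlling 75% of the validators went offline:
offline_share = Fraction(3, 4)
online_share = 1 - offline_share
finality_possible = can_finalize(online_share)

# The largest offline share the network can tolerate is one third.
max_tolerated_offline = 1 - FINALITY_THRESHOLD
```

With only 25% of the stake online, the two-thirds threshold is far out of reach; in fact, any group controlling more than one third of the stake could stall finality by going offline.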

In general, the following findings are as expected:

The lower levels of sophistication feature higher participation but lower volume
The highest level of sophistication has the fewest overall participants, but makes up the largest amount of the network

And, below is an example of the results of the clustering. This table shows small snippets of the Levels being assigned, the number of validators in the cluster, the number of parents, an

indication of whether clusters were merged across parents and other anecdotal evidence

applied in the decision making process. The highlights are just to point out the assigning of some of the characteristics (i.e., just to show a few examples here and there...)

Methodology Step #6:

Gather Network Participation Data for Each Validator

Overview: Now that we have clusters and classifications for all of the validators, we would

like to assess the performance characteristics of each classification of validators, in an

attempt to draw useful conclusions for each classification.

Approach: At this point in the game, there were several ways in which we could go. A

few good Samaritans in the ETHStaker discord chat posted access to dumps of data, which

could have been used. However, since several people participating in the challenge were already exploiting that data, it was thought that we’d try to do something different.

A first attempt was made to get validator performance data from a locally running

BeaconChain node via the Prysm API. It was known that the APIs were in flux, but it was

thought “couldn’t hurt to try!”.

The code for hitting the API endpoint is in getApiData.py and hitEndPoint.py, which are both

python scripts, located in the same Git Repo as all of the other code, as referenced in the

introduction.

With limited success on getting enough interesting data natively from the endpoint, it was

decided to scrape some summary data from the Explorer. The scraper for this step is

again a Python utility, getValidatorSummary.py. The summaries are post-processed to produce a CSV file. The post-processing is done using a Java utility, processValidatorPerformance.java.

Data: The final CSV summary is in the file validatorPerformanceSummary.csv. In reviewing

the information that was garnered from this approach, it was clear that it would be difficult

to draw any conclusions on network participation/validator-performance without taking into

account the Medalla network roughtime event. Given this, additional data was gathered.

The additional data looked at Validators that were eligible at genesis, considering in

particular, balances at 3 epochs: before roughtime, a day or so after the network began

finalizing again, and 2 weeks past. This data came from hitting the API of our local node

and is in Data/genesisValidatorsBalances.txt. Note, the endpoint for getting a balance for a given validator at a given Epoch was operational, unlike the endpoints that were tried for

getting validator performance.

Notes: The clock has just about expired on this challenge but we have been able to

process the data which shows balances for validators that were eligible at genesis at Epoch

2100, which was before the roughtime event. The file Data/balanceAtEpoch.csv has all of

the data (Epoch 2100 + 2 other Epochs picked to carve out the influence of the roughtime),

but we are only presenting results here on Epoch 2100.

The small table below shows the results. In the computations, we discounted validators

that were never active. As can be seen, the validators that were designated as the most

sophisticated by the clustering process indeed showed the best performance with the highest positive balance, using normalized balances. The raw difference and the %

difference are relative to the LEVEL_3 numbers. This can be interpreted as LEVEL_0

(hobbyists) garnering about 20% less positive gain as compared with LEVEL_3. Note that

these computations looked only at positive balances. So, not only were validators that were

never active excluded, so were validators that had an overall negative balance difference.

Just to reiterate, the following numbers are generated with these conditions:

Validators that were eligible at Genesis
Balances taken at epoch 2100
Validators that were never activated were discarded
Only compute for validators that had an overall positive net at epoch 2100
Balances normalized based on deposits and then centered to 0.0
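The conditions above can be sketched as a small computation. This is one plausible reading, not the code used for the study: it assumes “normalized and centered to 0.0” means balance divided by deposit, minus 1.0, and the record keys and toy balances are made up for the example.

```python
from statistics import mean


def normalized_gain(balance, deposit):
    """Balance relative to the deposit, centered so that break-even is 0.0."""
    return balance / deposit - 1.0


def mean_positive_gain_by_level(validators):
    """Average normalized gain per level, using only net-positive validators."""
    gains = {}
    for v in validators:
        if not v["ever_active"]:
            continue  # validators that were never activated are discarded
        gain = normalized_gain(v["balance"], v["deposit"])
        if gain > 0:  # only validators with a positive net balance count
            gains.setdefault(v["level"], []).append(gain)
    return {level: mean(values) for level, values in gains.items()}


def percent_below_reference(means, reference="LEVEL_3"):
    """Percent by which each level's mean gain falls below the reference."""
    ref = means[reference]
    return {level: (ref - m) / ref * 100 for level, m in means.items()}


# Made-up balances in ETH at a single epoch.
toy_validators = [
    {"level": "LEVEL_3", "balance": 32.5, "deposit": 32.0, "ever_active": True},
    {"level": "LEVEL_0", "balance": 32.375, "deposit": 32.0, "ever_active": True},
    {"level": "LEVEL_0", "balance": 31.5, "deposit": 32.0, "ever_active": True},
    {"level": "LEVEL_0", "balance": 32.0, "deposit": 32.0, "ever_active": False},
]
toy_means = mean_positive_gain_by_level(toy_validators)
toy_gaps = percent_below_reference(toy_means)
```

The toy numbers are chosen for easy arithmetic and do not reproduce the study's results; the point is only to show how the exclusions and the LEVEL_3 reference enter the calculation.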

Methodology Step #7:

Final Conclusions

A lot of data was collected and organized in this study. A clustering algorithm was

designed and code was written to execute. The focus of the study was taken from 2

suggestions provided by the challenge organizers: “Dig around on the explorers” and

“Validator Clustering”.

The major takeaway comes from the clustering process, where it is shown that 75% of the network is being secured by < 3% of the participants. Additionally, it is shown that from genesis to Epoch 2100, the least sophisticated validators (LEVEL_0) garnered about 20% less positive gain than the most sophisticated (LEVEL_3).

With more time and effort, further analysis could study uptime, inclusion delays, etc., now that the validators have been clustered and characterized.

In addition to the above, it is noted that this study focused a lot on relationships

amongst the validators. In the future, it might be interesting to develop a graph model to

interactively explore validator performance and relationships. (A graph model, such as one

that could be spun up in a Neo4J sandbox, inherently treats relationships as first class

citizens).

Lastly, friends, it is hoped that this blog post and the data may be interesting to

others, and may provide a starting point for further analysis on Medalla and ultimately

carry through to support monitoring and analysis on mainnet.

Please feel free to download this blog post to save to your personal library of Favorites!
