Adding data trading to an agent-based model of the economy

May 15, 2016

It’s been a while since I gave an update about my attempt to build an agent-based model for the information economy. That’s partly because I got distracted crowdsourcing election candidate data and results for Democracy Club. It’s also partly because of this:

you think success is a straight line but it's actually a squiggle
Image from Demetri Martin’s “This is a Book”

You may recall that I constructed a basic agent-based economic model and added trade to it, based on Wilhite’s Bilateral Trade and ‘Small-World’ Networks. Then I did some sensitivity analysis on it to check that the assumptions that I’d made in coding it all up weren’t changing the outputs in any major way.

My next step was to add DATA to the mix.In the process I realised that the model wasn’t mirroring reality sufficiently for me to be drawing conclusions from it.

Creating an economic model that includes data

Adding data to the agent-based model is pretty straight forward. It’s just the same as FOOD. Each firm has some initialData which it can supplement either by trading or by producing (based on its dataPerStep productivity) to update its currentData running total. A Firm’s utility is calculated as DATA x FOOD x GOLD rather than just FOOD x GOLD in the original scenario.

In my initial model, the price for DATA is calculated in the same way as the price for FOOD. This isn’t fair, however, because unlike with FOOD, when DATA is sold the seller doesn’t lose any DATA. (The impact of this essential difference between data and physical goods is the thing that I wanted to explore in these models.)

If we take the same example as I gave when added trade to the basic model, but this time with DATA rather than FOOD, this is how the trade works between a Firm that starts off with 30 GOLD and 5 DATA and a Firm that has 10 GOLD and 15 DATA. The price is set, as it would be if they were trading FOOD, to 2 GOLD for each DATA. At that price, the exchange goes like this:

30 5 150 6.00 10 15 150 0.67
28 6 168 4.67 12 15 180 0.80
26 7 182 3.71 14 15 210 0.93
24 8 192 3.00 16 15 240 1.07
22 9 198 2.44 18 15 270 1.20
20 10 200 2.00 20 15 300 1.33
18 11 198 1.64 22 15 330 1.47

In the original example, with FOOD, both Firms gain from the transaction up until 10 GOLD have been traded for 5 FOOD, at which point both have 20 GOLD, 10 FOOD and a utility of 200, an increase of 50 each. If they keep trading from that point their utility start to go down.

When DATA is involved, however, the Firm that is selling DATA does much better out of the transaction. Every step of the trade increases its utility because it is always simply adding GOLD rather than reducing its stock of DATA. So while the buyer’s utility rises from 150 to 200 in the trade, the seller’s utility doubles from 150 to 300.

Figuring out what a “fair price” for DATA would actually be, within this model, is something I want to come back to, but I thought I’d run the model with the price set in the same way as the price for FOOD is set, to give a baseline.

An economic model that includes data increases trading

Including data in the agent-based model does make some obvious changes. The first thing that’s really apparent is that compared to the FOOD-GOLD model, a lot more trading goes on.

In the FOOD-GOLD model, each Firm initiates an average of 1.25 trades, with a range of 0 to 12. Over the 5 runs, there are a total of 3182 trades.

In the DATA-FOOD-GOLD model, each Firm initiates an average of 3.54 trades over the 20 ticks, with a range of 0 to 14, with a total of 8893 trades over the 5 runs. This is biased heavily towards DATA trading, with each firm averaging 2.61 DATA trades (ranging from 0 to 13) and 0.93 FOOD trades (ranging from 0 to 7). About 74% of the trades that go on in this model involve DATA.

I realised having done this that the increase in trading could just be because there’s an additional good to trade, namely DATA. So I created an alternative COAL-FOOD-GOLD model where there’s again the additional good, but one that operates exactly like FOOD.

In the COAL-FOOD-GOLD model, each firm initiates an average of 1.59 trades, with a range of 0 to 15 and a total of 3971 trades over the 5 runs. As you’d expect, these are pretty evenly split between COAL and FOOD: 49% of the trades involve COAL.

So the increase in trading is partly due to there being more goods to trade, but mostly because of the unique nature of DATA.

Price stabilitsation with data trading

The price stabilisation graphs for food and for data are below.

price stabilisation for food

price stabilisation for data

Both prices stabilise at around 1 GOLD over time, but the prices of FOOD are a lot more variable than those for DATA. My guess is that this is because there’s less trading of FOOD than there is trading for DATA.

A look at inequality

One of the things that I’m particularly keen to examine in this model is whether data being in the mix changes the inequality in the set of Firms in the economy.

One thing to examine here is the relationship between the initial utility of each Firm and the final utility of each firm. In this version of the model, the initial utility is randomised (rather than being based on how much the Firm can produce, or all being initially equal). You’d expect a small correlation between what you start with and what you end up with, but not a large one as 20 steps is plenty of time to trade or produce your way out of your starting position. Here are the correlations:

model correlation

So there’s some evidence there that organisations that when data is added to the mix, the starting condition of each Firm is more influential than it would otherwise be, but it’s still not a very strong correlation.

The other thing I looked at were Gini coefficients of the economy, which is a measure of how unequal a society is, with 0% being perfect equity and 100% perfect inequality (one person having all the wealth). (For reference, the UK’s Gini coefficient is about 34%.)

model initial Gini coefficient final Gini coefficient

The Gini coefficients are roughly the same. But the thing about this result that makes me question whether the model is accurate is the fact that the Gini coefficients are decreasing from the initial state to the final state. This isn’t the case in the UK or the US, for example, where inequality is growing, and if you look at the global Gini index you’ll see it’s been increasing over time.

If the Gini coefficient in the model is decreasing, that’s a sign that the model isn’t properly reflecting inequalities that arise in real economies. That would probably be fine if I didn’t want explicitly to study inequalities. Given I do, it feels like I have to refine the model a bit more to make it better reflect reality so that we can draw conclusions from it.

Next steps

First, I need to add a mechanism to the model to measure the Gini coefficient over time. I’m currently only measuring the Gini coefficient at the very beginning (where wealth distribution is randomised) and at the very end (after 20 ticks of trading). It might be that the Gini coefficient goes down rapidly during the price stabilisation phase and then starts increasing, and therefore the model is accurately reflecting the increase of the Gini coefficient over time, once it gets going. I need to monitor it on each tick in order to work that out.

Then, if the Gini coefficient isn’t increasing, I need to add some mechanisms to the mix that are likely to increase inequality. Things that I’ve thought of are:

  • reducing FOOD and/or GOLD by a fixed amount each tick, to mirror the minimum expenses a Firm incurs simply for existing; if I do this I have to add the possibility of Firms failing (and new Firms being created) or going into debt
  • adding a mechanism that enables those Firms that have more GOLD to get more GOLD, for example by lending at interest to other Firms or by investing in increasing their own productivity
  • breaking up the economy into smaller sub-economies that only trade with each other, with some connecting Firms that can trade across those sub-economies; this was a variant in Wilhite’s original paper but I don’t know if it had an effect on inequality

If you have any other ideas, let me know.

Trying it out

As before, if you want to try out where I’ve got to so far, I’ve committed the source code to Github and there are instructions there about how to run it. Any comments/PRs will be gratefully received. The code is quite messy now and could do with a refactor.

I’ve also put all the raw data generated from the runs described in this post in spreadsheets. These are:

Feel free to copy and create your own analyses if you don’t want to run the models yourself.

Sensitivity analysis on an agent-based economic model

Apr 1, 2016

Previously, in my quest to build an agent-based model for the information economy, I constructed a basic model and added trade to it, based on Wilhite’s Bilateral Trade and ‘Small-World’ Networks.

From doing that, we’ve seen that price stabilisation occurs over roughly the first 10 cycles, with about 38% of the 500 agents being pure producers, and about 5% only responding to trade requests from others.

There are a few parts of this model where I’ve made choices that might influence the outcome. To test these out, I want to do a sensitivity analysis to double-check that I’m not drawing unwarranted conclusions from single runs.

Setting up Repast to do multiple runs

Repast can be used to do batch runs of a particular model, spawning several instances with different starting conditions and therefore different end points.

Getting this working had a few false starts. Batch runs need to include code that stops the run after a set number of cycles. This code needs to be placed in the src/informationeconomy/context/SimBuilder.groovy file, which you don’t normally see when viewing the Package Explorer in Eclipse. Getting the simulation to stop after 20 iterations requires a simple line:

public class SimBuilder implements ContextBuilder {
	public Context build(Context context) {

With that in place, the Batch Run Configuration tool enables you to run any number of concurrent “worlds”. I ran five with different random seeds. The following price stabilisation graph shows that they all reach price stabilisation after about eight iterations (pale lines are individual runs; stronger lines are the average over these runs):

graph showing price stabilisation as max and min prices converge on a mean over around 8 ticks

With 20 ticks per run, about 43% spend all their time producing goods and 57% trade in some way. About 6% never initiate trade themselves but just respond to offers from other agents.

Initial FOOD and GOLD

The first area where I want to carry out some sensitivity analysis is in the initial amount of FOOD and GOLD that each agent has. In the runs described above, each agent starts with the same amount of FOOD and GOLD that they can make in a turn. There are two other options that I want to test out: one where every agent starts with one FOOD and one GOLD, and one where each agent starts with a random amount of FOOD and GOLD (between 1 and 30).

With all Firms initially having a random amount of FOOD and GOLD, there are slightly fewer pure producers (38%) and more Firms that only accept trades (10%). Prices don’t start as high and follow a smoother path to a later stabilisation (around 14 steps in), as shown here:

graph showing price stabilisation as max and min prices converge on a mean over around 16 ticks

As we’d expect, there’s no relationship between initial utility and final utility when the Firms’ initial utility is unrelated to their ability to produce goods:

graph showing final utility based on initial utility

Starting with one FOOD and one GOLD leads to more trading, with only 30% pure producers and 17% of Firms only accepting trades. Prices start higher (after no trading in the first step) but settle down in the same way as with the other kinds of starting conditions.

graph showing price stabilisation as max and min prices converge on a mean over around 16 ticks

Given the smoothness of the price stabilisation curve when Firms start with random amounts of FOOD and GOLD, I will use this version of the code going forward.

Randomising FOOD or GOLD production

On each step, each Firm currently has to decide whether to produce FOOD, produce GOLD, or trade. The code that determines which they choose to do has some built-in biases:

if (utilityMakingFood > utilityMakingGold) {
	if (trade['utility'] > utilityMakingFood) {
		action = makeTrade(trade)
	} else {
		currentFood += foodPerStep
		action = [ type: 'make', good: 'food', amount: foodPerStep, utility: currentUtility() ]
} else if (trade['utility'] > utilityMakingGold) {
	action = makeTrade(trade)
} else {
	currentGold += goldPerStep
	action = [ type: 'make', good: 'gold', amount: goldPerStep, utility: currentUtility() ]

The Firm will only consider making FOOD if the utility of making it is greater than the utility of making GOLD. Similarly, it will only consider making a trade if the utility of trading is greater than producing either FOOD or GOLD. This should bias the Firms towards producing FOOD or GOLD, and specifically towards producing GOLD, all things being equal.

Under the initial configuration, where Firms begin with the amount of FOOD and GOLD that they can produce in a single step, 85% of Firms produce GOLD at some point, and 84% produce FOOD. Across the 5 runs, only 24 Firms produce only FOOD (never trading or producing GOLD), but even fewer produce only GOLD (2, over the 5 runs).

Under a randomised initial amount of FOOD and GOLD, 79% produce GOLD and 81% produce FOOD, with 41 only producing FOOD and 21 only producing GOLD over the 5 runs.

So I don’t think that the code is biasing the results towards the producing of GOLD, but it’s hard to tell whether it’s biasing away from trade. I’ve added a bit of randomness:

def randomlyTrue = random(10000) > 5000
if (utilityMakingFood > utilityMakingGold || (utilityMakingFood == utilityMakingGold && randomlyTrue)) {
	randomlyTrue = random(10000) > 5000
	if (trade['utility'] > utilityMakingFood || (trade['utility'] == utilityMakingFood && randomlyTrue)) {
		action = makeTrade(trade)
	} else {
		currentFood += foodPerStep
		action = [ type: 'make', good: 'food', amount: foodPerStep, utility: currentUtility() ]
} else if (trade['utility'] > utilityMakingGold || (trade['utility'] == utilityMakingGold && randomlyTrue)) {
	action = makeTrade(trade)
} else {
	currentGold += goldPerStep
	action = [ type: 'make', good: 'gold', amount: goldPerStep, utility: currentUtility() ]

As anticipated, this makes very little difference. Over the five runs, one more Firm produces FOOD than previously, one more never trades, one more only produces FOOD and five more only produces GOLD. There is a more significant increase in the percentage of Firms that only receive (but do not initiate) trade, rising from 257 (10%) to 273 (11%).

Price stabilisation occurs as before, though the graph does show a more regular oscillation in maximum price over the first few ticks, compared to the slightly smoother trajectory shown in the previous graphs.

graph showing price stabilisation as max and min prices converge on a mean over around 16 ticks

All in all, the model does not appear to be sensitive to the biases in the code that determine how Firms choose what to do on each step. I will keep the less biased code.

Next steps

Next, it’s time to introduce DATA to the mix. My goal for the initial experiment is simply to replace FOOD with DATA, use the same formula to work out the price for DATA, but introduce the crucial difference between FOOD and DATA, namely that when you trade DATA, you do not lose it. I want to see what happens to price stabilisation in this scenario, and look at the kinds of Firms that emerge.

Trying it out

As before, if you want to try out where I’ve got to so far, I’ve committed the source code to Github and there are instructions there about how to run it. Any comments/PRs will be gratefully received.

I’ve also put all the raw data generated from the runs described in this section in a spreadsheet which you’re welcome to copy and run your own analysis over.

Introducing trade to a basic agent-based economic model

Mar 11, 2016

My next step in building an agent-based model for the information economy is to add trade to the basic model.

The trade protocol is described in Wilhite’s Bilateral Trade and ‘Small-World’ Networks. First, you calculate the marginal rate of substitution for an agent: the amount of GOLD that they would need to exchange for a unit of FOOD in order to increase their utility. This can be calculated for each agent as:


or in code form:

def mrs() {
	currentGold / currentFood

If this is greater than 1 then the Firm would rather buy FOOD (give up GOLD for FOOD). If it’s less than 1 then the Firm would rather sell FOOD (give up FOOD for GOLD).

Working out a fair price

Now imagine you have two Firms, A and B. Firm A has 20 GOLD and 10 FOOD (a utility of 20 × 10 = 200, and a mrs of 20 / 10 = 2). Firm B has 10 GOLD and 20 FOOD (with a utility of 200 as well, but a mrs of 10 / 20 = 0.5). These two firms can beneficially trade with each other to maximise their utility. If Firm A buys 5 FOOD from Firm B for 5 GOLD, both Firms end up with 15 GOLD and 15 FOOD, a utility of 225 with a mrs of exactly 1. This is the best they can do: if Firm A buys 4 FOOD for 4 GOLD they both end up with a utility of 14 × 16 = 224, if Firm A buys 6 FOOD for 6 GOLD they also both end up with a utility of 16 × 14 = 224.

In another scenario, let’s say you have Firm A with 20 GOLD and 10 FOOD and Firm B with the same. In this case, both Firms want to buy FOOD — they both have a mrs of 20 / 10 = 2, above 1 — and there is no mutually beneficial trade.

Then there’s the case where Firm A has 30 GOLD and 5 FOOD (a utility of 150 and a mrs of 30 / 5 = 6) and Firm B has 10 GOLD and 15 FOOD (a utility of 150 and a mrs of 10 / 15 = 0.67). In this case, if Firm A bought 2 FOOD from Firm B for 2 GOLD, Firm A would have 28 GOLD and 7 FOOD (a utility of 196, an increase of 46). Firm B would have 12 GOLD and 13 FOOD (a utility of 156, an increase of just 6). In this case the price of 1 GOLD for 1 FOOD unfairly benefits Firm A more than Firm B.

Wilhite uses the following formula to calculate the acceptable price between two firms:


In the final scenario described above, the price would be:

price = 30 + 10 5 + 15 = 40 20 = 2

This means Firm A should pay 2 GOLD for each FOOD from Firm B. Let’s see how that price works out:

30 5 150 6.00 10 15 150 0.67
28 6 168 4.67 12 14 168 0.86
26 7 182 3.71 14 13 182 1.08
24 8 192 3.00 16 12 192 1.33
22 9 198 2.44 18 11 198 1.64
20 10 200 2.00 20 10 200 2.00
18 11 198 1.64 22 9 198 2.44

Each step of the trade is equitable, and both achieve a maximum utility when Firm A has bought 5 FOOD for 10 GOLD, at which point they both have the same amount of GOLD and FOOD.

Turning this into code

First a few utility functions. We already have one to work out the marginal rate of substitution. This one works out the price that the firm should use to deal with another firm:

def priceForTrade(firm) {
	(currentGold + firm.currentGold) / (currentFood + firm.currentFood)

Now some functions that work out how much FOOD or GOLD to trade:

def tryBuyingFood(firm) {
	def price = priceForTrade(firm)
	def foodToBuy = Math.floor((firm.currentFood - currentFood) / 2)
	def result = [
		firm: firm,
		price: price,
		food: currentFood + foodToBuy,
		gold: currentGold - foodToBuy * price,
		utility: 0
	result.put('utility', utility(result['food'], result['gold']))
	return result

def trySellingFood(firm) {
	def price = priceForTrade(firm)
	def foodToSell = Math.floor((currentFood - firm.currentFood) / 2)
	def result = [
		firm: firm,
		price: price,
		food: currentFood - foodToSell,
		gold: currentGold + foodToSell * price,
		utility: 0
	result.put('utility', utility(result['food'], result['gold']))
	return result

For this version of the project, each Firm is going to try to trade with every other firm. The code to work out the best possible trade looks like this:

// work out the mrs for the firm
def mrs = mrs()

// set up a variable to record the best trade found
def trade = [
	firm: null,
	price: 0,
	food: currentFood,
	gold: currentGold,
	utility: currentUtility()

// cycle through each of the firms to see whether a trade is worthwhile
def thisFirm = self()
firms().each {
	def result = null
	if (mrs >= 1 && it.mrs() < 1) {
		// more GOLD than FOOD, so buy FOOD
		result = thisFirm.tryBuyingFood(it)
	} else if (mrs < 1 && it.mrs() >= 1) {
		// more FOOD than GOLD, so sell FOOD
		result = thisFirm.trySellingFood(it)
	} else {
		result = trade
	// set the best trade to the result if it's a better trade than the best found so far
	if (result['firm'] == null) {
		trade = result
	} else if (result['utility'] > trade['utility']) {
		trade = result

The final piece of the puzzle is to work out what to do given knowledge about the best possible trade, the potential utility achieved by making FOOD and the potential utility achieved by making GOLD. The decision uses this code:

if (utilityMakingFood > utilityMakingGold) {
	if (trade['utility'] > utilityMakingFood) {
	} else {
		currentFood += foodPerStep
} else if (trade['utility'] > utilityMakingGold) {
} else {
	currentGold += goldPerStep

Where makeTrade() is defined as:

def makeTrade(trade) {
	trade['firm'].currentFood += currentFood - trade['food']
	trade['firm'].currentGold += currentGold - trade['gold']
	currentFood = trade['food']
	currentGold = trade['gold']

Adding monitoring

This code is all that’s needed to get the agents operating in a market. But to get useful data out of the model, we have to capture what’s going on for each agent. To do that, I’ve set up an actions property that holds an array of actions that the agent takes. These can be of three types: making FOOD, making GOLD and initating trade (I don’t capture the recipient of trade). I’ve also created an activity property that is very similar but includes times when the agent is the recipient of a trade rather than the initiator. They’re both defined as arrays:

def actions = []
def activity = []

with the makeTrade() function returning the relevant trade action and adding to the activity of the firm that’s being traded with:

def makeTrade(trade) {
	def action = [ type: 'trade', food: trade['food'] - currentFood, gold: trade['gold'] - currentGold, price: trade['price'], utility: trade['utility'] ]
	trade['firm'].currentFood += currentFood - trade['food']
	trade['firm'].currentGold += currentGold - trade['gold']
	trade['firm'].activity << [ type: 'receive-trade', food: currentFood - trade['food'], gold: currentGold - trade['gold'], price: trade['price'], utility: trade['firm'].currentUtility() ]
	currentFood = trade['food']
	currentGold = trade['gold']
	return action

and the code that decides which action to take adding the relevant action to the actions list:

def action = []
if (utilityMakingFood > utilityMakingGold) {
	if (trade['utility'] > utilityMakingFood) {
		action = makeTrade(trade)
	} else {
		currentFood += foodPerStep
		action = [ type: 'make', good: 'food', amount: foodPerStep, utility: currentUtility() ]
} else if (trade['utility'] > utilityMakingGold) {
	action = makeTrade(trade)
} else {
	currentGold += goldPerStep
	action = [ type: 'make', good: 'gold', amount: goldPerStep, utility: currentUtility() ]
actions << action
activity << action

Looking at the results

First achievement: this looks to work! In a single sample run of 30 ticks, some of the agents (about 38%) spend all their time producing goods, while others (about 62%) trade in some way. Of those that trade, about 5% never initiate trade themselves but just respond to offers from other agents. The price negotiated with each trade starts off quite uneven, but stabilises at around 1 FOOD for 1 GOLD:

graph showing price stabilisation as max and min prices converge on a mean over around 8 ticks

Because of the way the model is set up, with no consumption of either FOOD or GOLD once it’s created, all the Firms increase in utility over the run. We can plot the final utility of each firm against its initial utility like so:

graph showing correlation between initial utility of a Firm and its final utility

The expected utility of a firm, if it doesn’t participate in any trading, can be calculated as foodPerStep * 16 * goldPerStep * 16. This accounts for the strong diagonal line in the above graph: one dot for each Firm that is a pure producer. The dots above that line are the Firms that participate in trade, who manage to achieve a greater utility through trading than they would if they had just produced all the time. There are no dots below the line because (unlike in real life) the Firms are very sensible about only choosing actions that are going to increase their utility.

An example trading history of just the first activities for a Firm that does well out of trading looks like this:

activity FOOD GOLD utility
  24 2 48
accept offer to sell 11 FOOD for 6.6 GOLD 13 8.6 111.8
produce 24 FOOD 37 8.6 318.2
sell 12 FOOD for 18.13 GOLD 25 26.73 668.19
produce 24 FOOD 49 26.73 1309.65
produce 24 FOOD 73 26.73 1951.11
sell 19 FOOD for 23.93 GOLD 54 50.66 2735.73
produce 24 FOOD 78 50.66 3951.61
produce 24 FOOD 102 50.66 5167.49
produce 24 FOOD 126 50.66 6383.36
accept offer to sell 40 FOOD for 37.66 GOLD 86 88.32 7595.29

This Firm only ever produces FOOD. It trades eight times during the 30 ticks of the simulation, initating the trade itself half of the time and ends up having about 9 times as much utility by the end as you’d expect given its starting condition.

It’s easy to predict which Firms will do well within the simulation: the higher the ratio between the amount of FOOD a Firm can produce and the amount of GOLD they can produce, the better they do. For example, a Firm that can produce only 1 FOOD per tick, but 25 GOLD per tick will do a lot better than a Firm that can produce 5 FOOD and 5 GOLD per tick, despite starting with the same utility. Specialisation wins!

Next steps

There are a few parts of the model that I am not sure about. There are parts of the code that, all things being equal, favour production over trading and favour GOLD production over FOOD production. I’ve also started each Firm off with amounts of FOOD and GOLD that depend on how much they can create, rather than starting everyone off with 1 FOOD and 1 GOLD, or randomising how much they get at the start. I want to do a bit of sensitivity analysis to make sure the model is robust before I expand it to include DATA.

I also want to work out how to run the simulation multiple times so that I can aggregate the results and smooth out some of the jagged lines in the graph.

Trying it out

As before, if you want to try out where I’ve got to so far, I’ve committed the source code to Github and there are instructions there about how to run it. Any comments/PRs will be gratefully received.

Building a basic agent-based economic model in Repast

Feb 9, 2016

My previous post talked about building an agent-based model for the information economy.

Step one is to create the basic agent-based model described in Wilhite’s Bilateral Trade and ‘Small-World’ Networks in Repast Simphony. In fact I’m going to set myself an even smaller task for this post: setting up the Firm agents to produce FOOD and GOLD without letting them trade.

Let’s set up the agents first. Each agent has at any point in time an amount of FOOD and an amount of GOLD. These are set in the Firm.groovy code.

def currentFood = 0
def currentGold = 0

Calculating utility

From the FOOD and GOLD we can apparently calculate the utility of the agent using a rudimentary Cobb-Douglas function like so:

def utility(food, gold) {
	food * gold

Utility is an economic term and as I said I’m not an economist. According to the Wikipedia article, utility is about commodities, not agents, so I’m not sure that this is “the agent’s utility” as such. The Cobb-Douglas function is giving the utility of the commodity generated from the FOOD and GOLD. Since that commodity doesn’t go anywhere, I’m taking it that in this model the “current utility” is more like a measure of, in layman’s terms, “current wealth”. I’d love it if someone corrected my understanding here.

The Cobb-Douglas function used here, and in Wilhite’s paper, is one particular form of a more general function:

Y = A L β K α

(Go, MathML! I remember a time you’d have to have a plugin in the page to render that!)

In the version we’re using L (usually labour) is FOOD and K (usually capital) is GOLD. These seem like reasonable analogies, or you could argue it the other way round.

As you can see, in the version Wilhite uses, both α and β are 1. There are some consequences to this (“returns to scale are increasing”, whatever that means) but I’m going to follow Wilhite’s model for now.

Choosing what to do

At each step, the agents can choose whether to produce FOOD or to produce GOLD. The amount that they can produce of each at each step is random and needs to be set up when the agent is first created. So I need to define some variables on the agents to hold the values:

def foodPerStep = 0
def goldPerStep = 0

Then when the agents are created I need to set these randomly (between 1 and 30 is what Wilhite uses). This is in the UserObserver.groovy code as the setup function run at the beginning of the simulation:

def setup(){
	setDefaultShape(Firm, "circle")
		foodPerStep = random(29) + 1
		goldPerStep = random(29) + 1

You’ll notice that I decided to make the Firms circles in the visualisation and distribute them randomly. Good enough for now. And I’m making 500 Firms: that’s the number Wilhite used.

On each step, each Firm will decide whether to produce FOOD or produce GOLD, based on which increases their utility the most. The choice is in this code:

def step() {
	def utilityMakingFood = utility(currentFood + foodPerStep, currentGold)
	def utilityMakingGold = utility(currentFood, currentGold + goldPerStep)
	if (utilityMakingFood > utilityMakingGold) {
		currentFood += foodPerStep
	} else {
		currentGold += goldPerStep

Note that if it doesn’t make a difference whether FOOD or GOLD gets produced, it’ll produce GOLD.

To make that work, the UserMonitor.groovy code needs to call the Firm.step() method for each Firm on each step:

def go() {


When you run this code in Repast it generates a rather pretty picture of lots of multi-coloured circles (completely meaningless of course):

Randomly distributed firms in Repast

Repast lets you monitor individual agents. So I can see that the first agent in the simulation has a foodPerStep of 12 and a goldPerStep of 1. Here’s how its food and gold changes over the ticks of the simulation:

step FOOD GOLD utility
0 0 0 0
1 0 1 0
2 12 1 12
3 12 2 24
4 24 2 48
5 24 3 72
6 36 3 108

So things are working as expected (if not in any very interesting way, since the agents aren’t currently interacting with each other).

Thinking ahead

The main next step is to get trade working between the agents, which will hopefully make things more interesting. There are several areas that seem extremely simplified:

  • I’m not sure that α and β in the Cobb-Douglas function should both be 1. I think perhaps they should sum to 1 (eg be 0.75 and 0.25). Maybe different agents should have slightly different values for those exponents.

  • If the projected benefit of making FOOD or GOLD are the same, perhaps the agent should randomly choose between them rather than always producing GOLD. That would introduce a bit more randomness into the simulation.

  • Currently, the amount of FOOD and GOLD that the agent can produce at each step is distributed evenly across the agents: roughly the same number of agents will be able to produce between 1-5 FOOD as between 10-15 FOOD as between 25-30 FOOD. There are various other distributions that could be used: perhaps a exponential distribution to reflect the fact that there are many more small companies than large ones.

  • Shouldn’t the ability of an organisation to produce be related to its current holdings? I would have thought that the production ability of a firm should be proportional to its existing FOOD and GOLD rather than fixed throughout its life.

  • Shouldn’t there be some consumption of FOOD and GOLD that leads to firms going bankrupt if they don’t have minimum levels? Should there be some mechanism for new entrants to appear? If we’re looking at an area where there is innovation, these factors seem important.

I keep reminding myself that examining what happens when these tweaks are added is part of the experimentation that the model enables. The main goal at this stage is just to replicate Wilhite.

The one piece where I will need to make a decision is in how to fit DATA’s role into the Cobb-Doublas function to represent the value of information or knowledge. I’d value some guidance on this. The two options that I can think of are:

  • DATA is A. A is supposed to be “Total factor productivity”, representative of technological growth and efficiency.

  • DATA adjusts α and/or β. These are supposed to be the “output elasticities” of labour and capital, ie increasing the utility from the same amount of FOOD and/or GOLD. This could be rationalised as data providing the ability to get more output from the same inputs.

Using DATA in either of these ways will change the way in which utility is measured but scale in different ways. I’m inclined to use the simplest (DATA is A) as a starting point.

Trying it out

If you want to try out where I’ve got to so far, I’ve committed the source code to Github and there are instructions there about how to run it. Any comments/PRs will be gratefully received.

Agent-Based Model of the Information Economy: Initial Thoughts

Feb 9, 2016

There are several studies on the impact of open data, ranging from specific case studies through to macroeconomic studies that estimate the overall impact of greater availability of information.

Case studies are problematic because they tend to look only at effects around specific datasets or companies, and it’s hard to know how or whether these scale up to the economy as a whole. From what I’ve seen of the macroeconomic studies, on the other hand, they tend to be based on estimates created by multiplying big numbers together and don’t reveal who the winners and losers might be in a more open information-based economy.

Is it the case that opening data simply increases the gap between the information haves and have-nots, and that leads to wider economic inequality, or does everyone benefit when information is more widely available? Are there tipping points of availability at which we start realising the benefits of open data? What is the role of government in encouraging data to be more widely available and more widely used? To what extent should government invest in data infrastructure as a public good? How can local or specialist cooperatives pool resources to maintain data?

These are the kinds of questions we have answer in order to help frame public policy. While I can come up with and rationalise answers to those questions, I would prefer that they were based on something slightly more rigorous than a persuasive argument.

So I have been thinking recently about applying Agent-Based Modelling (ABM) or more specifically Agent-based Computational Economics (ACE) to the information economy. These seem to be promising techniques to get under the skin of the kind of impact that data might have on the variety of actors that there are within the economy, and to understand how public policy might affect that impact.

Now, I am not an economist, I do not have any background in agent-based modelling and it’s been a long time since I could justifiably call myself an artificial intelligence researcher. I’d like to learn about this field, and need to in order to pursue the goals described above. I hope other people will forgive my ignorance and mis-steps, and behave as this XKCD encourages:

Model design

I’ll share where my thinking is at so far.

The useful Tutorial on agent-based modelling and simulation contains a number of questions to help modellers focus on what they want to achieve:

What specific problem should be solved by the model? What specific questions should the model answer? What value-added would agent-based modelling bring to the problem that other modelling approaches cannot bring?

I’ve described above the kind of questions that I have in mind. Trying to be more specific, I’d like to explore:

  • What different groups or clusters of agents emerge in an information economy? Via an undergraduate dissertation by Gemma Hagen, I’ve found a paper by Allen Wilhite that talks about the emergence of specialisation in production or in trade. The Deloitte report on PSI talks about information holders, infomediaries, consumers and enablers: is there a model in which these roles emerge?
  • What difference does it make to overall productivity in an economy when there is a culture of data being open, available for anyone to access, use and share, as opposed to paid-for, available only to researchers or for non-commercial use, or completely closed?
  • How do share-alike licences propagate within an information economy? I would like to be able to work out whether we would achieve higher productivity by promoting share-alike licensing rather than public domain or attribution-based licensing.

An agent-based approach enables the study of emergent roles and enables us to tweak parameters to explore the difference that policy positions, capability building and emerging technologies might provide.

What should the agents be in the model? Who are the decision makers in the system? What are the entities that have behaviours? What data on agents are simply descriptive (static attributes)? What agent attributes would be calculated endogenously by the model and updated in the agents (dynamic attributes)?

Models of the circular flow of income contain additional agents. A two-sector model includes Households and Firms; a three sector model introduces Government. Given that I am particularly interested in the role of government and information held or gathered by the public sector, I think the model should eventually include Government, but I’m tempted to keep it really simple and only consider Firms in the first instance. Given at the moment I’m not so interested in the role of citizens, I think it shouldn’t include Households but instead abstract away their contribution.

In the Wilhite model there is only one type of agent (a Firm) but different Firms have different (randomly assigned) abilities to produce Goods of two types which you might characterise as FOOD and GOLD, which means that they naturally fall into different categories as producers or (if they’re not very good at production) traders.

So to start with, to keep it simple, I think the model should assign each Firm agent with a variable capacity to create DATA, FOOD and GOLD. We can imagine that agents that can create a lot of GOLD are services companies; those that can create a lot of FOOD are factories; those that can create a lot of DATA are technology companies.

Each agent will keep track of their DATA, FOOD and GOLD. To simulate the role of DATA within a Firm, I think that the amount of DATA held by the Firm should influence its productivity, namely its ability to create more FOOD or GOLD. Calculated from the amount of FOOD and GOLD held by the Firm is its wealth, which the Firm will attempt to maximise.

What is the agents’ environment? How do the agents interact with the environment? Is an agent’s mobility through space an important consideration?

I’m not sure whether to include proximity between Firms within the model. To keep it simple, and especially to mirror the fact that when it comes to exchanges of data, distance doesn’t matter, I’m tempted to simply have the agents co-existing in a soup.

What agent behaviours are of interest? What decisions do the agents make? What behaviours are being acted upon? What actions are being taken by the agents?

Mirroring the Wilhite model, I plan to let the agents choose between the following actions:

  • produce DATA, FOOD or GOLD
  • trade DATA, FOOD or GOLD with other Firms

In the initial model, they should choose between these actions based on which is going to increase their wealth the most. To ensure that agents ever want to trade for DATA, given that it doesn’t contribute directly to their wealth, they will need to have at least some predictive ability: to look forward to what they will be able to do if they have more DATA.

The other thing that makes DATA distinctive is that it is non-rivalrous. When it is traded, the original owner of the DATA doesn’t lose it. I’ll have to see how and whether that leads to sensible costing for DATA - I suspect that I’ll have to introduce knowledge of prior prices or earlier deals into the model to make it work.

How do the agents interact with each other? With the environment? How expansive or focused are agent interactions?

The Firms will interact with each other through trade. They can trade either DATA or FOOD for GOLD or vice versa, with a willing partner.

Where might the data come from, especially on agent behaviours, for such a model?

I aim to dig into the existing macroeconomic studies to see what estimates they use for the added productivity associated with access to data. I can also base behaviours on the Open data means business study we did at ODI. I’m open to other suggestions.

How might you validate the model, especially the agent behaviours?

There are statistics on the distribution of Firm sizes and sectors within the UK economy that could be used to validate the model. We can see whether instances emerge that match some of the case studies that we use. Again, I’m open to sugestions.

Modelling software

There’s a whole bunch of agent-based modelling software out there. I’ll admit when I looked at that list I was put off by the number of Java-based implementations, because it’s been something like 15 years since I last wrote Java.

I did like the look of ABCE because it’s Python (even though I don’t know Python very well either) and has a lot of the basic economic model that’s needed built-in. However, I was concerned that it might be hard to adapt it to an information economy because its primitive operations assume the transfer of physical goods, and about its relative immaturity.

Eventually I’ve settled on using Repast Simphony. The main reason for adopting it is the level of support that’s available, the easy-to-follow tutorial being a good example.

Final thought

I didn’t know anything about this field - not even that it was called agent-based computational economics - when I started looking on Friday night. By Sunday evening I had written my first agent-based model using the Repast tutorial and had a reasonably good idea about what to try first.

It’s at times like these when you realise quite how amazing the web is, how transformational. I can’t even imagine how I would have gotten started looking at this without it. But you also realise that we make this web through being generous with our time and knowledge, and not just on Wikipedia and Twitter. Take a look at these fantastic pages maintained by Leigh Tesfatsion on agent-based computational economics. From the HTML I guess they were started some time ago; the dates indicate they are still maintained. They contain tutorials, references to prior work, pointers to people working in the field. I have barely scratched the surface.

You also realise the tragedy of restricted access to academic research. The papers whose abstracts seem interesting but probably not £30 interesting. The book chapters that you can read snippets of, or pay £80 for. A world where those outside academia can’t easily access academic research is one where that research can have little impact.

I’ve created a repository on Github which I’ll use for code as it comes (currently all it has is the README and an open licence!). If you have any comments, advice or suggestions then please use the issue list there.