Job Type: Software Development
We are working on decentralizing information and credibility distribution.
Society suffers when small groups control what others get to know and believe, whether through control of broadcasting (TV, radio, newspapers, access to social media platforms) or of credibility signaling (who should be trusted and who shouldn't).
We are building an influence algorithm. In other words, we are trying to find ways to describe groups of people mathematically. Many tried and failed before. But we think we can make it work.
Our core hypothesis is that influence can be quantified by tracking attention flows. In order to do that, we ingest data streams from multiple sources (we started with Twitter and are now indexing podcasts and soon more). We then cross-reference these datasets in an attempt to continuously improve the accuracy.
The accuracy of our work is verified by members of the groups we aim to describe. We publish our results in real time, and thousands of people already use our scores. It is hard to verify when we are right, but very easy to tell when we are wrong. This short feedback loop puts us in a unique position to work on problems that might be much harder, or impossible, to solve elsewhere.
We are a small, VC-funded startup. We are a remote-first team. Most of the team is based in Europe (Berlin, London, Barcelona). You can make your own hours, but everybody is expected to be online during office hours in CET. We try to meet in person and work together for several days at least every 3 months. Other than that the company ‘lives’ in Slack, Notion and other tools enabling effective communication.
The job is full-time and permanent. If you work from Germany, you will need a German work permit and will receive an employment contract. If you work from anywhere else in the world, you will work as a freelancer and must fulfill the legal requirements for freelancing in your country.
If you work from Berlin, you can work from our Berlin office, located in Mitte.
You will employ creative solutions to problems, working across the Algorithm, Platform and Product teams. You will ensure that we are deploying optimal solutions to the most pressing problems.
Your ultimate goal is to ensure that the Algorithm team receives the best possible data set. You will use this objective to define and prioritize your own tasks. This does not mean that you will always work directly on obtaining the dataset required by the Algo team. In some cases, you may decide that the best way to help the Algo team to achieve their objective is to first provide another data stream to the Product team. This, in turn, will result in obtaining user-generated data that enables you to refine the data stream initially requested by the Algo team.
You will be responsible for designing the data structure. You will have to account not only for current challenges, but also anticipate what data we will be collecting in the years to come. You will make sure that our architecture can handle all kinds of data that we are looking to index.
This, of course, means that you need a deep understanding of what the data is going to be used for. You need to be able to evaluate what level of accuracy is required, what can be achieved with various approaches, and use this information to decide which strategy to pursue.
You will think about these problems in the broad context of the whole organization. You need to understand the dependencies between teams, the data providers, and the technical limitations. This role requires a combination of excellent people and technical skills.
You will own the whole lifecycle of collecting, cleaning and refining our data.
Collecting - Cleaning - Refining
You will work with multiple APIs, RSS feeds, scrapers and any other means you see fit to collect relevant data. This may also include designing processes and incentives for users to willingly provide data directly to us.
You will be responsible both for identifying the best source of a given dataset and for the technical execution. You will make sure that all of our processes are legal and ethical, and that they result in high-quality data.
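To make the collection work concrete, here is a minimal sketch of one such path: pulling items from an RSS feed (e.g. a podcast feed) into a normalized record. The `Record` shape and field names are illustrative assumptions, not our actual schema.

```python
# Minimal sketch: parse an RSS 2.0 document into normalized records.
# The Record fields below are assumptions for illustration only.
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Record:
    source: str      # where the item came from (feed URL, API name, ...)
    title: str
    published: str   # raw date string; normalization happens downstream

def parse_rss(xml_text: str, source: str) -> list[Record]:
    """Extract <item> entries from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.iter("item"):
        records.append(Record(
            source=source,
            title=item.findtext("title", default=""),
            published=item.findtext("pubDate", default=""),
        ))
    return records
```

In practice, each source (API, feed, scraper) would feed the same normalized record type, so that cleaning and refining stages do not need to care where a given item came from.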
You will employ various techniques in order to make sure that the data is as clean and correct as possible. You will have flexibility to explore various technical approaches, but also experiment with system design and/or social design solutions.
For example, your tasks may include:
We are in a unique position where we can leverage both multiple streams of data and user input. You will be responsible for making sure we take full advantage of this. For example, it is often easy to perform 90% of the labelling with an algorithmic solution, while the remaining 10% is very hard to get right through automation yet trivial for a user to do, and a manageable amount of data for humans to label. In this case, you would need to:
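The 90/10 split described above could be sketched as a simple confidence-based router: items the algorithm labels confidently are accepted automatically, and the rest are queued for users. The `classify` callable and the threshold value here are stand-ins, not a real model.

```python
# Hedged sketch of a human-in-the-loop labelling split: auto-label
# confident items, queue the hard remainder for users. The classifier
# and threshold are illustrative assumptions.

def route_labels(items, classify, confidence_threshold=0.9):
    """Split items into auto-labelled pairs and a human review queue.

    `classify` returns (label, confidence) for one item; anything below
    the threshold goes to the human queue instead of being auto-labelled.
    """
    auto, human_queue = [], []
    for item in items:
        label, confidence = classify(item)
        if confidence >= confidence_threshold:
            auto.append((item, label))
        else:
            human_queue.append(item)
    return auto, human_queue
```

The design choice worth noting is that the threshold trades automation coverage against the size of the human queue, so it would be tuned against how much labelling effort users can realistically absorb.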
EUR 50,000 - 70,000 per year
Equity 0.14 - 0.18 %