I recently started working on a medical research project for hemophilia. I dreamed about working on projects that have a big impact but never thought that the opportunity will come so fast! Thank you Apple for releasing ReseachKit!
A bit about Hemophilia
Hemophilia is a group of hereditary genetic disorders that impairs the body’s ability to control blood clotting. There are two types of hemophilia type 8 and type 9 – the name comes from the missing clotting factor. The treatment is pretty straight forward: haemophiliacs inject themselves with the clothing factor every couple of days to maintain their factor level in a normal rage.
People with hemophilia have lower clotting factor level of blood plasma or impaired activity of the coagulation factors needed for a normal clotting process. Thus when a blood vessel is injured, a temporary scab does form, but the missing coagulation factors prevent fibrin formation, which is necessary to maintain the blood clot. A haemophiliac does not bleed more intensely than a person without it, but can bleed for a much longer time. In severe haemophiliacs even a minor injury can result in blood loss lasting days or weeks, or even never healing completely. In areas such as the brain or inside joints, this can be fatal or permanently debilitating.
I got really excited after reading a about this disease. I know – a bit creepy – but here are some interesting facts I found out about Hemophilia:
It’s almost a guy thing
One of the first things I’ve learned about Hemophilia was that 1 in about 5.000-10.000 males have factor 9 deficiency and 1 in 20.000-34.000 males have factor 8 deficiency. I wondered for a while why this happens. It seems that the genetic defect is on the X chromosome. You might remember from biology that there are two sex chromosome: X and Y. Males have XY and females have XX. Because of that if a male has a buggy X chromosome it will always manifest. The chance of a female having two defective copies of the gene is very remote, so females are almost exclusively asymptomatic carriers of the disorder.
Although it is not impossible for a female to have haemophilia, it is unusual: daughters which are the product of both a male with haemophilia A or B and a female carrier will have a 50% chance of having haemophilia. The genes on the X chromosome can “mix and match” meaning the body can decide which part of which gene from the 2 X chromosomes to use. So, a “normal” gene and a severe hemophilia gene can produce anywhere from normal levels to severely deficient levels. The mechanism behind this is poorly understood. It fortunately leans strongly toward “normal” so severe hemophilia in women remains very rare. It does mean that many women are currently undiagnosed.
It has no cure, yet!
The current treatment is periodically infusing the clotting factor into your system. Yep – needles every few days if not more than once a day in severe cases. There are two recent improvements in treatment: products with superior half-life which require less frequent infusions and promising results with gene therapy.
In December 2011, a team of British and American investigators reported the successful treatment of haemophilia B using gene therapy. The investigators inserted the F9 gene into an adeno-associated virus-8 vector, which has a propensity for the liver, where factor 9 is produced, and remains outside the chromosomes so as not to disrupt other genes. The transduced virus was infused intravenously. To prevent rejection, the people were primed with steroids to suppress their immune response.
Source: Nicholas Wade (10 December 2011). “Treatment for Blood Disease Is Gene Therapy Landmark”. The New York Times.
It has an interesting history and community behind it
Nothing brings a group together like shared pain. Well, this group of people have been through a lot together.
The clotting factor was made by processing donated blood. Before 1985 in the US there were no laws that mandated the screening of donated blood. Contaminated blood products where a serious public health problem in the late 1970s through 1985. A as a result a large number of haemophiliacs got infected with HIV and hepatitis C, estimations range from 6.000 to 10.000 in the US alone – that’s between a third and half the total number of haemophiliacs. Honestly, I can’t believe they don’t teach us about this in school!
Clotting factor and half-life
Half-life is the amount of time required for the amount of something to fall to half its initial value. The term is very commonly used in nuclear physics to describe how quickly unstable atoms undergo radioactive decay, but it is also used more generally for discussing any type of exponential decay.
Clotting factor that has been infused does not stay in the body but is used up. The rate at which it is used up is called its “half-life”. Here is how the level fluctuates over a few days:
Factor 9 tends to have a big half-life ranging from 24 hours up to 90 hours, while factor 8 has a half life of 8-12 hours.
Let’s model an infusion
Three data points define an Infusion
the time of the infusion, the level increase or dose and the half life used for that product.
struct Infusion {
var startingLevel: Double
var startTime: Time
var halfLifeUsed: Double
init(startingLevel: Double, startTime: Time, halfLifeUsed: Double) {
self.startingLevel = startingLevel
self.startTime = startTime
self.halfLifeUsed = halfLifeUsed
}
}
The individual half life can vary from one individual to another. To better approximate your factor level you can get your blood tested and find out whats your individual half-life. The property is named halfLifeUsed
instead of halfLife
to underline this fact.
Another thing you may have noticed is that I’ve used a Time
type. Time
is just an alias for Int
. I wasn’t sure if time would best represented as Int
or Double
so I made it interchangeable.
typealias Time = Int
I’ve also wrote some helper functions to write time intervals in more humane way:
1.minute // instead of 60
1.hour + 10.minutes // instead of 4200
10.days.ago // Instead of NSDate(timeIntervalSinceNow: NSTimeInterval(-864000)).timeIntervalSince1970
extension Time {
var years: Time {
return self * 365.days
}
var year: Time { return years }
var days: Time {
return self * 24.hours
}
var day: Time { return days }
var hours: Time {
return self * 60.minutes
}
var hour: Time { return hours }
var minutes: Time {
return self * 60.seconds
}
var minute: Time { return minutes }
var seconds: Time {
return self
}
var second: Time { return seconds }
var ago: Time {
return NSDate(timeIntervalSinceNow: NSTimeInterval(-self)).asTime
}
static var minTime: Time {
return NSDate.minTime
}
static var now: Time {
return NSDate().asTime
}
}
extension NSDate {
var asTime: Time {
return Time(timeIntervalSince1970)
}
class var minTime: Time {
return distantPast().asTime
}
class var maxTime: Time {
return distantFuture().asTime
}
}
Getting the the factor level at a point in time
To find out the level at a moment in time we use the formula:
StartingLevel * 0.5^(∆t/HalfLife)
∆t
is the number of hours that passed since the infusion
If no time passed since the infusion 0.5^(∆t/HalfLife)
will be 1 because 0.50 = 1 If 1 half-life passed0.5^(∆t/HalfLife)
will be 0.5 because 0.51 = 0.5. If 1 half-life passed 0.5^(∆t/HalfLife)
will be 0.25 because 0.52 = 0.25 This continues to half with each passing half-life.
We can implement this formula as a method on Infusion
:
struct Infusion {
...
func factorLevel(time: Time) -> Double {
guard startTime <= time else {
return 0
}
let dt = Double(time-startTime)/Double(1.hour)
return startingLevel*pow(0.5, dt/halfLifeUsed)
}
}
I’ve also added a subscript for convenience:
struct Infusion {
...
subscript(time: Time) -> Double {
return factorLevel(time)
}
}
Lets take it for a spin:
var time = 10.days.ago
let infusion = Infusion(startingLevel: 100,
startTime: time,
halfLifeUsed: 24)
var dayCount = 0
while time < .now {
let level = infusion[time]
print("On day \(dayCount) - level \(level)")
time += 1.day
dayCount += 1
}
Output:
On day 0 - level 100.0
On day 1 - level 50.0
On day 2 - level 25.0
On day 3 - level 12.5
On day 4 - level 6.25
On day 5 - level 3.125
On day 6 - level 1.5625
On day 7 - level 0.78125
On day 8 - level 0.390625
On day 9 - level 0.1953125
Adding infusions
In real life you have more than one active infusion in your system. Adding the levels of the recent ones will give the total level.
struct InfusionSet {
var infusions: [Infusion]
subscript(time: Time) -> Double {
return factorLevel(time)
}
init(infusions: [Infusion]) {
self.infusions = infusions
}
func factorLevel(time: Time) -> Double {
return infusions.reduce(0) { sum, infusion in
sum + infusion[time]
}
}
Let’s simulate a simple treatment: one infusion every 3 days with a 80 percent level increase. We will compute the factor level every 15 minutes.
let nInfusions = 10
let nDaysBetweenInfusions = 3
var infusions: [Infusion] = (1...nInfusions).map { i in
Infusion(startingLevel: 80, startTime: Time(i * nDaysBetweenInfusions).days.ago, halfLifeUsed: 24)
}
let infusionSet = InfusionSet(infusions: infusions)
var time = Time(nInfusions * nDaysBetweenInfusions).days.ago - 6.hours
let end = 5.days + .now
while time < end {
let level = infusionSet[time]
time += 15.minutes
}
Here is a graph of the factor level values obtained:
Performance
I’ve made a few tests to see how the naive solution performs:
number of infusions | time (seconds) |
---|---|
10 | 0.005 |
50 | 0.019 |
100 | 0.057 |
250 | 0.339 |
1000 | 3.742 |
I ran the tests on my laptop. I’m not sure exactly how much but I expect the code to run slower on a phone.
One of the things we are trying to make is a tool for comparing different treatments. You can select the product you want to use, the schedule and other variables like the target factor level and the minimum level. The app would simulate the infusions and show a chart with the level over time and would calculate things like how many units you will infuse in a year or the time you spend with a high bleeding risk.
An average treatment might have about 100 infusion in a year. Using this solution to compute the graph while playing around with different variables will give us only a few frames per second which is not ideal.
Optimising the solution
The current solution has the complexity O(NumberOfInfusion * NumberOfSamples)
because at each point in we compute the value of every infusion.
We notice that an infusion only has values after its start time. Also that it only has a relevant value for a few days. One idea would be to find only the infusions that have relevant values at each point – about 3-5 in the average case.
Finding relevant value ranges
First thing we need to do is to find moment an infusion reaches a minimum level. Values under 1% can be safely ignored because it’s really hard to overdose with clotting factor.
To find out when an exponential curve reaches a certain value we need to use a bit of math. Please don’t get scared of logarithms! Vi Hart has an awesome video in which she explains that logarithms are nothing but fancy counting.
The implementation is only a few lines of code:
struct Infusion {
...
func timeToReachLevel(endLevel: Double) -> Time {
let timeToLevel = halfLifeUsed * (log(endLevel / startingLevel) / log(0.5))
return startTime + Time(timeToLevel * Double(1.hour))
}
}
Because we don’t have logarithm function with a variable base we had to use the change of base formula:
logb x = loga x / loga b
log(endLevel / startingLevel) / log(0.5)
The code above actually means:
log0.5 endLevel
/ startingLevel
Or how many half-lives have to pass to get from startingLevel
to endLevel
.
Math functions tend to each up considerable time – we are going to save the time each infusion reaches a minimum level in a property called endTime
.
struct Infusion {
var startingLevel: Double
var startTime: Time
var halfLifeUsed: Double
var endTime: Time
let minLevel = 0.5
init(startingLevel: Double, startTime: Time, halfLifeUsed: Double) {
self.startingLevel = startingLevel
self.startTime = startTime
self.halfLifeUsed = halfLifeUsed
endTime = 0 // needed before calling any methods
endTime = timeToReachLevel(minLevel)
}
...
}
Now we know that each infusion have relevant values between startTime
and endTime
– for all other times we consider the value to be 0
.
Ordering Infusions
One idea would be to order infusions by their start and end time. When we compute the level at a certain time we can use binary search to quickly find all the functions we are looking for.
Thanks to Swift generic standard library all we need to do to order an array of infusions by calling sort()
is to makeInfusion
Comparable
:
let sortedInfusions = infusions.sort()
To conform to Comparable
we need to define the lower than <
and equal ==
operators.
struct Infusion: Comparable {
...
}
func <(lhs: Infusion, rhs: Infusion) -> Bool {
if lhs.startTime == rhs.startTime {
return lhs.endTime < rhs.endTime
} else {
return lhs.startTime < rhs.startTime
}
}
func ==(lhs: Infusion, rhs: Infusion) -> Bool {
return lhs.startTime == rhs.startTime &&
lhs.halfLifeUsed == rhs.halfLifeUsed &&
lhs.startingLevel == rhs.startingLevel
}
Binary search
Binary search is an algorithm that reduces the search space by half at each step. You can read more about binary search here.
Instead of implementing the classical version of binary search I’m going to use a similar algorithm that reduces the search space in half at each step. It does almost the same steps as binary search would – I prefer this version for two reasons: 1) I’ve been a competitive coder for a long time and I always messed up the implementation of binary search with left
,right
and mid
– but with this approach it was the other way around it almost always worked once I wrote it 2) you can easily change to code to get different results or adapt it to more than one dimension
Before I do the fancy stuff I’m going to implement a brute force algorithm and then optimise it.
It all begins with a few infusions ordered by their startTime
and endTime
:
let infusions = aFewInfusions.sort()
A moment in time:
let time = 3.days.ago
And a question: what is the first infusion that is still active at time
?
infusions
is ordered ascending by their startTime
. One solution would be to go over each one:
var index = 0
while index < infusions.count {
if index + 1 < infusions.count &&
infusions[index + 1].endTime < time {
index += 1
} else {
break
}
}
index // is the index of the first infusion that is still active at time - or 0 if all infusions start after time
What this algorithm does is check one by one if a infusion has values in time
. It does this by finding the first infusion that has endTime
after time
. If all infusions start after time
index
will remain unchanged.
if index + 1 < infusions.count &&
infusions[index + 1].endTime < time {
index += 1
} ...
index + 1 < infusions.count
checks is we can go one element to the right infusions[index + 1].endTime <= time
checks if that infusion ended before time
Now we can compute the current factor level by adding all the infusions that have values after index
var level: Double = 0
while index < infusions.count && infusions[index].startTime <= time {
level += infusions[index][time]
index += 1
}
level // 80.3
We can optimise the first part of the algorithm instead by taking bigger steps. Instead of going only one element to the right we can exclude more at once. Why? Because if one infusion before time
all the rest up to that point also ended beforetime
.
We implement this by adding a step
variable to our algorithm. The initial step is half the number of infusions. If we can’t jump step
infusions we try again after halving the step
until the step is 0
.
var step = infusions.count / 2
var index = 0
// find the first relevant infusion
while step > 0 {
if index + step < infusions.count && infusions[index + step].endTime < time {
index += step
} else {
step /= 2
}
}
Getting all the values
We notice that if we want to compute the factor level every 15 minutes the result from the first part of the algorithm is almost the same for consecutive times.
If we persist that start index from one time to another we can make this solution O(numberOfSamples + numberOfInfusions)
struct InfusionSet {
...
func enumerate(every step: Time, callback: (Time, Double) -> ()) {
guard infusions.count > 0 else {
return
}
var startIndex = 0
var time = infusions.first!.startTime - 6.hours
let end = infusions.last!.endTime
while time <= end {
while startIndex + 1 < infusions.count && infusions[startIndex + 1].endTime < time {
startIndex += 1
}
var level: Double = 0
for var index = startIndex; index < infusions.count && infusions[index].startTime <= time; index += 1 {
level += infusions[index][time]
}
callback(time, level)
time += step
}
}
}
Now we can iterate over all the values in no time:
infusions.enumerate(every: 1.hour) { time, level in
...
}
Performance with this solution:
number of infusions | time (seconds) |
---|---|
10 | 0.004 |
50 | 0.004 |
100 | 0.004 |
250 | 0.005 |
1.000 | 0.0013 |
100.000 | 0.952 |
With this I was able to make a prototype for the treatment risk calculator. The data is generated pretty fast. Unfortunately I couldn’t find a chart library that can reload more than 1000 data points at 60 frames per second.
Conclusions
A bit of optimisations and math can take you a long way. Do your homework!
I want to thank Dan Bond for inviting me to join his project and for reviewing this article. The project will be open source and I will be writing more about it as it evolves. At the moment we are rewriting Dan’s app iBleed and experimenting with the treatment risk calculator.