Infection, imitation and a hierarchy of computer viruses


computers & securi ty 25 ( 2006) 469 473
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/cose
Infection, imitation and a hierarchy of computer viruses
Zhi-hong Zuo*, Qing-xin Zhu, Ming-tian Zhou
College of Computer Science and Engineering, University of Electronic Science & Technology of China, Chengdu, Sichuan 610054, PR China
a r t i c l e i nf o a b s t r a c t
Article history: Infection is an essential character of computer viruses. In addition, computer viruses can
Received 19 May 2005 also imitate the behavior of infected programs in some ways in order to hide themselves. In
Revised 12 October 2005 this paper we define infection and imitation mathematically, and classify computer viruses
Accepted 6 February 2006 into 3 types according to their different imitation behaviors. Furthermore, we give some re-
sults about the degree of unsolvability of each type of computer viruses. We show that the
Keywords: set of type 0 and type 1 computer viruses is P2-complete, while the set of type 2 computer
Computer viruses viruses is P3-complete.
Infection ª 2006 Elsevier Ltd. All rights reserved.
Imitation
Complete sets
Hierarchy
1. Introduction have been proposed so far (Thimbleby et al., 1999; Jian et al.,
2003; Chang and Shao-Ren, 2001; Zuo and Zhou, 2004). In a
The first abstract theory of computer viruses is the viral set recent paper (Zuo and Zhou, 2004), we improved Adleman s
theory given by Cohen, based on the Turing machine (Cohen, definitions of computer viruses to comply with the common
1989, 1994). A viral set is defined by (M, V) where M is a Turing understanding of computer virus, and proved that the set of
machine and V is a non-empty set of programs on M. Each computer viruses with the same kernel is P2-complete. In gen-
v Û V is called a computer virus and satisfies the following con- eral they formed a S3-complete set. We have also proved the-
ditions: if it is contained in the tape at time t, then there is oretically the existence of some computer viruses that have
a time t0 and a v0Û V such that v0 is not contained in the tape not been discovered yet (for example, the polymorphic viruses
at time t, but contained in the tape at time t0. The most impor- with infinite forms). In another paper (Zhi-hong et al., 2005),
tant one of Cohen s theorems is about the undecidability of we investigated the time complexity of computer viruses.
computer viruses (Cohen, 1989, 1994). Infection is the key character of computer viruses. In addi-
In a different approach, Adleman (1988) developed an tion, computer viruses often imitate the behavior of the
abstract theory of computer viruses based on recursive func- infected programs in some ways in order to hide themselves.
tions. In his definition a virus is a total recursive function v Different imitation behaviors lead to different mathematical
which applies to all programs x (the Gödel numberings of pro- features. In this paper we define infection and imitations
grams) such that v(x) has characteristic behaviors of computer mathematically, obtain a hierarchical structure of computer
viruses such as injury, infection and imitation. Furthermore, viruses according to their different imitation behaviors and
Adleman (1988) proved that the set of computer viruses is prove the strict inclusions.
P2-complete. The structure of this paper is as follows: in Section 2 we
There are some shortcomings in the computer virus introduce some basic concepts and notations; in Section 3
models given by Cohen and Adleman. Several improvements we give the definitions of infection, imitation and a hierarchical
* Corresponding author.
E-mail address: zhzuo@uestc.edu.cn (Z.-hong Zuo).
0167-4048/$  see front matter ª 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cose.2006.02.001
470 computers & securi ty 25 ( 2006) 469 473
structure of computer viruses. In Section 4 we give some Imitation is a property upon which computer viruses rely
important theorems and prove them. In Section 5, we give a to behave like the original programs. It is not indispensable
brief summary and some discussion for these results. for computer viruses, but most currently found computer
viruses do have imitation property.
2. Preliminaries
Definition 2. (Imitation) A total recursive function v is said to be
imitative if it satisfies
We describe some notations below.
ci Wi >2/dCd;pDÛWiXWvðiÞ fvðiÞðd;pÞźfiðd;pÞ : (2)
LetNbe the set of all natural numbers and S be the set of all
finite sequences of natural numbers. For s1;s2;.;sn Û S, let
Cs1;s2;.;snD denote a computable injective function from Sn
toNand its inverse is also computable. If f :N/Nis a partial
Imitation property makes the infected program v(i) behave
function, for s1;s2;.;sn Û S, we write fðs1;s2;.;snÞinstead of
in some computations like the original program i, i.e., there
fðCs1;s2;.;snDÞ. Similarly, for i1;i2;.;in ÛN, let Ci1;i2;.;inD de-
exist some environments (d, p), fv(i)(d, p)źfi(d, p).
note a computable injective function fromNn toN, satisfying
Ci1;i2;.;inD im for all 1 m n, and its inverse is also com-
Definition 3. (N-Imitation) A total recursive function v is said to
putable. We also use fði1;i2;.;inÞto represent fðCi1;i2;.;inDÞ.
be N-imitative if it satisfies
For a sequence pźði1;i2;.;ik;.;inÞÛ S, let p(i) denoteitsith
element, and x Ûs p represent that x is in the sequence p, i.e.,
ci Wi is infinite/dNCd;pDÛWiXWvðiÞ fvðiÞðd;pÞźfiðd;pÞ ;
xźp(i) for some i. For s1;s2;.;sn Û S, x Ûsðs1;s2;.;snÞmeans x
(3)
Û si for some 1 i n. Let p[jk/ik] denote the sequence obtained
s
ÿ
by replacing ik with jk in p, i.e., p jk=ik ź i1;i2;.;jk;.;in . If v is where symbol dN means existing infinitely many.
a computable function, p[v(ik)/ik] is simply written as p½vðikÞŠ. If
more than one element in p is replaced or evaluated by some N-Imitation property not only requires the infected
computable functions, we write the result as p jk =ik ;jk =ik ; program v(i) behave in some computations like the original
1 1 2 2
ÿ ÿ ÿ
.;jk=ik or p v1 ik ;v2 ik ;.;vl ik , respectively. program i, but also requires infinitely many of these
l l 1 2 l
Adopting Adleman s (1988) notations, let fP(d, p) denote computations.
a function computed by a computer program P in the running
environment (d, p) where d represents data (including clock, Definition 4. (Computer virus) A computer virus is a total a
spaces of diskettes and so on) and p represents programs (in- recursive function v satisfying the following conditions:
cluding operating systems) stored on computers. If the index
(the Gödel numbering) of P is e, the function is also denoted (1) v has infection property, or
by fe(d, p). The domain and range are denoted by We and Ee, re- (2) v has both infection and imitation property, or
spectively. If h is a recursive function, we also use the symbols (3) v has both infection and N-imitation property.
Wh and Eh for its domain and range. It is worth noting that
there is no essential distinction between d and p, as in the Computer viruses satisfying the above conditions (1), (2) or
case of Von Neumann machines. In this paper we use the (3) are called type 0, type 1 or type 2 viruses, respectively. The
symbol (d, p) just for easier understanding. set of type 0, type 1 and type 2 viruses are denoted by V0, V1
and V2, respectively. Obviously V0 J V1 J V2.
3. Definitions of infection, imitation, and the
4. Main results
hierarchy of computer viruses
In this section we prove our main results, using the traditional
In the following we give definitions of infection and imitation
notations and symbols of recursive function theory (Rogers,
first, and then derive the hierarchy structure for computer
1967; Soare, 1987).
viruses.
A computer virus can be viewed as a total recursive func-
Proposition 5.   v is infective  is a P2-predicate.
tion v which applies to every program i and obtains its infected
form v(i) such that v(i) can infect other programs (or MBR as
Proof. From Definition 1, it follows that
well as some documentations) under some conditions
(Adleman, 1988). In more technical terms, an infected pro-   v is infective 
gram v(i), when given some input (or environments) (d, p), its
5ci WisB/dCd;pDÛWvðiÞ dx vðxÞÛsfvðiÞðd;pÞ (4)
output fv(i)(d, p) should contain some other infected programs
v(x). It leads to the following definition of infection.
5ci ðWiźBÞn dCd;pDÛWvðiÞ dx vðxÞÛsfvðiÞðd;pÞ : (5)
Definition 1. (Infection) A total recursive function v is said to
Since
be infective if it satisfies
WiźB5cx:x;Wi (6)
xÛsfyðzÞ5di xźfyðzÞðiÞ ; (7)
ci WisB/dCd;pDÛWvðiÞ dx vðxÞÛsfvðiÞðd;pÞ : (1)
computers & securi ty 25 ( 2006) 469 473 471
we know that   WiźB  and   x Ûs fy(z)  are a P1-predicate and
vÛV15ðvÛV0Þ^ðv is imitativeÞ; (15)
a S1-predicate, respectively. Because   x Û Wy  is also a S1-
by Propositions 6 and 7, we know that V0 and V1 are P2-sets.
predicate (Rogers, 1967),   v is infective  is a P2-predicate. ,
Let A be any P2-set, R(x, y, z) be the recursive predicate
satisfying x Û A 5 cydzR(x,y,z). Let a be an integer, assume
Proposition 6.   v is imitative  is a P2-predicate.
Rź{a} and by Lemma 8 we have J(e,y). Let bźJ(i, my(J
(i, y) s a)). Clearly b is a recursive function of i. For a given
Proof. From Definition 2, it follows that
number m, define
  v is imitative 
8
>
>Cd;p;fkðmÞD; ifððCd;pDźaÞCd;pDźbÞÞ
5ci jWij>2/dCd;pDÛWiXWvðiÞ fvðiÞðd;pÞźfiðd;pÞ (8) <
^ðcyÿ
fði;k;x;Cd;pDÞź
>fiðd;pÞ; if Cd;pDÛEa ^ðcy > i
:
undefined; otherwise
5ci ðjWij 2Þn dCd;pDÛWiXWvðiÞ fvðiÞðd;pÞźfiðd;pÞ :
(16)
(9)
f(i, k, x, Cd, pD) can be computed by the following procedure.
Since
Given (i, k, x, Cd, pD), compute Jði;0Þ;Jði;1Þ;. starting from 0.
jWij>25dx;y;zÛWiðxsy^ysz^xszÞ; (10)
Let Cd, pDźJ(i, j ), when a value of Cd, pD is computed, we com-
pute the sequence Rðx;0;0Þ;.;Rðx;Cd;pD;0Þ;R.ðx;0;1Þ;.;
Rðx;Cd;pD;1Þ;.. If for every ywe know that   jWij>2  is a S1-predicate, hence   jWij 2  is
value of R(x, y, z) is 1 (true), then check if Cd, pD is equal to a or
a P1-predicate. Since   x Û Wy  is a S1-predicate,   v is imita-
b (provided b exists); if equal, the procedure outputs Cd, p,
tive  is a P2-predicate. ,
fk(m)D; otherwise outputs fi(d, p); in other situations (including
the case where b does not exist), the procedure does not termi-
nate, that is, f(i, k, x, Cd, pD) is undefined. By Church s thesis, f(i, k,
Proposition 7.   v is N-imitative  is a P3-predicate.
x, Cd, pD) is a recursive function.
By s m n theorem, there exists a total recursive function
Proof. From Definition 3, it follows that
b(i, j, k) satisfying
  v is N-imitative 
8
>
>Cd;p;fkðmÞD; ifððCd;pDźaÞCd;pDźbÞÞ
<
5ci Wi is infinite/dNCd;pDÛWiXWvðiÞ fvðiÞðd;pÞźfiðd;pÞ
^ðcyÿ
fbði;k;xÞðd;pÞź
>fiðd;pÞ; if Cd;pDÛEa ^ðcy> i
(11)
:
undefined; otherwise
(17)
5ci ðWi is finiteÞn cxdCd;pDÛWiXWvðiÞ ðCd;pD>xÞ
By the recursion theorem with parameters, there exists a
^ fvðiÞðd;pÞźfiðd;pÞ : ð12Þ
total recursive function n(x) such that fn(x)(i)źb(i, n(x), x), hence
8
Since   Wi is finite  is a S2-predicate (Rogers, 1967), and
>
>Cd;p;fxÞðmÞD; ifððCd;pDźaÞCd;pDźbÞÞ
<
  x Û Wy  is a S1-predicate,   v is N-imitative  is a P3- ^ðcyÿ ÿ
ff ðiÞðd;pÞź

>fiðd;pÞ; if Cd;pDÛEa ^cypredicate. ,
> i
:
undefined; otherwise
(18)
In the proofs of our main results, we also need the follow-
ing lemma.
If x Û A, then cydzR(x, y, z), therefore
8
Lemma 8. For any non-empty recursively enumerable set R, there is
ff ðiÞðd;pÞź fiðd;pÞ; ifCd;pDÛEa (19)
S
xÞ i
:
a recursive function J(e, y), such that Rng(ly.J(e, y))źWe R for
undefined; otherwise
any e. The set is also written as ER.
e
For any i, if a Û Wi or b Û Wi, in both cases
fxÞðmÞÛsff ðiÞðd;pÞ, i.e., fn(x) is infective. Moreover, assume
Proof. Let RźRng( g), here g is a recursive function. Define xÞ
8 jWij>2, then jWiÿ{a, b}j>0. From Eq. (19), we have
gðsÞ; if nź2sþ1
<
ff ðiÞðd;pÞźfiðd;pÞfor any Cd, pD Û Wiÿ{a, b}, i.e., fn(x) is imita-

fðe;x;nÞź x; if nź2s;xÛWe;sþ2ÿWe;s (13)
:
tive. Hence, for any x Û A, n(x) Û V1.
gð0Þ; otherwise
If x ; A, then dycz:R(x, y, z). Let y0 be an integer such that
where We,s is defined as in Soare (1987). Clearly f(e, x, n) is a
cz:R(x, y0, z), and let Wcź{Cd, pD:Cd, pD>y0}, for any Cd, pD Û Wc,
total recursive function. Let J(e, y)źf(e, l( y), r( y)), we have
we have that
the conclusion. ,
y0 Theorem 9. V0 and V1 are P2-complete sets.
Thus ff ðcÞðd;pÞ is undefined, that is, for any m,

fxÞðmÞ;ff ðcÞðd;pÞ. Hence n(x) is not infective, so that

Proof. Since
n(x) ; V0.
ÿ
In conclusion, A m V1;V0 . Hence V0 and V1 are P2-
vÛV05v is infective (14) complete sets. This completes the proof of the theorem. ,
472 computers & securi ty 25 ( 2006) 469 473
Theorem 10. V2 is a P2-complete set. Because V2 is a P3-complete set while V1 is a P2-complete
set, hence V1 s V2. Given m, define a function v such that (by
Proof. Since v Û V2 5 v Û V0^  v is N-imitative  , by Theorem 9 recursion theorem)
and Proposition 7 V2 is a P3-set.
fvðiÞðd;pÞźCd;p;vðmÞD: (27)
Let A be any P3-set, hence there is a S2-predicate R(x, y)
Clearly v Û V0 but v ; V1, hence we have the following
such that x Û A 5 cyR(x, y). Since the set {x: Wx is finite} is
theorem.
S2-complete, there is a recursive function g(x, y) satisfying
x Û A 5 cy(Wg(x, y) is finite).
Theorem 11. V0 I V1 I V2
For any integer a, let Rź{a} inLemma 8 we get the function
J(e, y). For a given i, let
5. Discussion
b1źJði;myðJði;yÞsaÞÞ; (21)
Definition 1 is a reasonable description for infective property
b2źJði;myððJði;yÞsaÞ^ðJði;yÞsbÞÞÞ: (22)
of computer viruses. But it is not the strongest definition.
It is obvious that b1 and b2 are recursive functions with
The strongest definition implies that no matter whether Wi
respect to i, and a s b1 s b2.
is empty or not, program fv(i) is infective. That is,
Given m, define
8
cidCd;pDÛWvðiÞ dx vðxÞÛsfvðiÞðd;pÞ : (28)
>
>Cd;p;fkðmÞD; ifðCd;pDźaÞCd;pDźb1Þ
<
fiðd;pÞ; ifCd;pDźb2
fði;k;x;Cd;pDÞź
Under such condition we can prove that V1 is P2-complete
>fiðd;pÞ; ifcy iðJðgðx;yÞ;Cd;pDÞźaÞ
>
:
and V2 is P3-complete, using the same arguments as in Theo-
undefined; otherwise
rems 9 and 10. But to prove that whether V0 is P2-complete or
(23)
not is still an open problem.
f(i, k, x, Cd, pD) can be computed by the following procedure.
The condition   Wi is not empty  in Definition 1 is replaced
Given (i, k, x, Cd, pD), first compute b1 and b2. If Cd, pD equals
by the condition   jWij>2  in Definition 2. If we only consider
a or b1 (provided b1 exists), then compute Cd, p, fk(m)D; if Cd, pD
imitative property, we may use the condition   Wi is not
equals b2 (provided b2 exists), then compute fi(d, p); otherwise
empty  . If we consider infective property together with imita-
compute J( g(x, 0), Cd, pD), J( g(x, 1), Cd, pD), ., J( g(x, i), Cd, pD), if
tive property, this is not a proper condition for ifjWijź1 the
all the values of the sequence are equal to a, then compute
program fv(i) cannot satisfy both infective and imitative prop-
fi(d, p); for any other cases (including the case when b1 and
erty. Although   jWij>1  is the best substitute for the condi-
b2 do not exist), f(i, k, x, Cd, pD) are undefined. By Church s the-
tion   Wi is not empty  , we do not know if V1 is P2-complete
sis, f(i, k, x, Cd, pD) is a recursive function.
under this condition.
Similar to Theorem 9, there exists a recursive function n(x)
The conclusion of Theorem 9 complies to some extent with
satisfying
Adleman s result on computer viruses (Adleman, 1988). That
8
is, the decision problems for type 0 and type 1 computer
>
>Cd;p;fxÞðmÞD; ifðCd;pDźaÞCd;pDźb1Þ
<
fiðd;pÞ; ifCd;pDźb2 viruses are unsolvable, and the degree of unsolvability is 2
ff ðiÞðd;pÞź

>fiðd;pÞ; ifcy iðJðgðx;yÞ;Cd;pDÞźaÞ
> (that is, solving these problems are harder than solving halt-
:
[; otherwise
ing problem). Theorem 10 shows that the decision problem
(24)
of type 2 viruses is even harder than that of type 0 and type
1 computer viruses. Most computer viruses currently found
and n(x) is both infective and imitative.
are type 2 viruses. They are both infective and imitative, imi-
If x Û A, then for any i, Wg(x, i) is finite. By the definition of
tating infected programs in infinitely many computations
J(e, y), the function lz. J( g(x, i), z) does not equal a for finitely
(or environments).
many z. Hence for all y i, the function lz.J( g(x, i), z) does not
In the definition of computer viruses (Definition 4), infective
equal a only for finitely many z. Therefore if Wi is an infinite
property is a necessary condition for a computer virus. Some
set, there are infinitely many Cd, pD Û Wi satisfying cy i(F( g
illegal programs (malicious programs) which do not have in-
(x, y), Cd, pD)źa), i.e., ff ðiÞðd;pÞźfiðd;pÞ. So that n(x) Û V2.

fective property are also called computer viruses in a less strict
If x ; A, there exists a y0such that Wg(x, y) is an infinite set.
sense. These situations are not included in that definition.
Let Tź{Cd, pD:dyite recursively enumerable set. Let c>y0and WcźT, Cd, pD Û Wc
and does not equals b2. If ff ðcÞðd;pÞźfcðd;pÞ, from Eq. (24)

we have
Acknowledgements
cyThe authors wish to thank the referees for their helpful
(25)
comments that improved the article greatly.
Meanwhile, by the definition of set T, we have
Cd;pDÛT5dy y0ðJðgðx;yÞ;Cd;pDÞsaÞ: (26)
ref erences
c
That lead to a contradiction. Therefore, though W is an
Û Wc can satisfy ff ðcÞðb2Þźfcðb2Þ. Hence
2
infinite set, only b

we have n(x) ; V2 for x ; A.
Adleman LM. An abstract theory of computer viruses. In:
In conclusion, A m V2, so V2 is a P3-complete set. , Goldwasser S, editor. Advances in cryptology  CRYPTO 88.
computers & securi ty 25 ( 2006) 469 473 473
Lecture notes in computer science, vol. 403. Berlin: Springer- than 20 journal papers and conference presentations. His
Verlag; 1988. p. 354 74.
research interests are in information security, semantic web
Chang T, Shao-Ren Z. Computational model of computer virus.
and natural language processing.
Chinese Journal of Computers 2001;24(2):158 63.
Cohen F. Computational aspects of computer viruses. Computers
Qing-xin Zhu received B. Sc. degree from Sichuan Normal Uni-
& Security 1989;8(1):325 44.
versity and M. Sc. degree from Beijing Institute of Technology,
Cohen F. A short course on computer viruses. Wiley; 1994.
China. He received M. Sc. degree from Carleton University,
Jian W, Chao-Jing T, Quan Z. A computer viruses infection
model based on an expanded universal Turing machine.
Ottawa, ON, Canada, and Ph.D. degree from Ottawa University,
Journal of Computer Research and Development 2003;40(9):
Ottawa, ON, Canada. He became a faculty member of the Uni-
1300 6.
veristy of Electronic Science and Technology of China (UESTC)
Rogers HJ. Theory of recursive functions and effective comput-
in 1998. He is now a professor of College of Computer Science
ability. McGraw-Hill; 1967.
and Engineering, UESTC. His research interests include net-
Soare RI. Recursively enumerable sets and degrees. Springer-
work security, computer vision, optimal search and optimal
Verlag; 1987.
Thimbleby H, Anderson S, Cairns P. A framework for modelling control, and bioinfomatics.
trojans and computer virus infection. Computer Journal 1999;
41:444 58.
Ming-tian Zhou received E.E.B.S. degree from Harbin Insti-
Zhi-hong Z, Qing-xing Z, Ming-tian Z. On the time complexity of
tute of Technology, Harbin, China in 1962. He became a
computer viruses. IEEE Transaction on Information Theory
faculty member at the University of Electronic Science and
2005;51(8):2962 6.
Technology of China (UESTC) in 1962. He is now a professor
Zuo Z, Zhou M. Some further theoretical results about computer
of College of Computer Science and Engineering, UESTC.
viruses. Computer Journal 2004;47(6):625 33.
He has published 13 books and more than 200 papers. His
research interests include network computing, computer
Zhi-hong Zuo received the M. Eng. degree in computer
network, middleware technology, and network and
software and a Ph.D. degree in computer science both from
information security. Prof. Zhou is a fellow of China Institute
College of Computer Science and Engineering, University of
of Electronics, and a Senior Member of Federation of China
Electronic Science and Technology of China. Currently he is
Computer.
an associate professor in the college and has published more


Wyszukiwarka

Podobne podstrony:
Advantages and disadvantages of computers
Stochastic Features of Computer Viruses
The Case for Beneficial Computer Viruses and Worms
Beyerl P The Symbols And Magick of Tarot
Sequencing and Analysis of Neanderthal Genomic
readme and terms of use 3d cad models
DOD Net Centric Data Strategy and Community of Interest (COI) Training Glossary
Benefits and secrets of fasting
Ecology and behaviour of the tarantulas
Ciaran Brady The Chief Governors; The Rise and Fall of Reform Government in Tudor Ireland 1536 158
Guide to Selection and Use of Disinfectants
Causes and control of filamentous growth in aerobic granular sludge sequencing batch reactors
06?TECT AND FILTERING OF HARMONICS
Introducing the ICCNSSA Standard for Design and Construction of Storm Shelters
On demand access and delivery of business information

więcej podobnych podstron