Vega Strike Python Modules doc 0.5.1
Documentation of the "Modules" folder of Vega Strike
SequenceMatcher Class Reference

Public Member Functions

def __init__
 
def set_seqs
 
def set_seq1
 
def set_seq2
 
def find_longest_match
 
def get_matching_blocks
 
def get_opcodes
 
def ratio
 
def quick_ratio
 
def real_quick_ratio
 

Data Fields

 isjunk
 
 a
 
 b
 
 matching_blocks
 
 opcodes
 
 fullbcount
 
 b2j
 
 b2jhas
 
 isbjunk
 

Detailed Description

SequenceMatcher is a flexible class for comparing pairs of sequences of
any type, so long as the sequence elements are hashable.  The basic
algorithm predates, and is a little fancier than, an algorithm
published in the late 1980s by Ratcliff and Obershelp under the
hyperbolic name "gestalt pattern matching".  The basic idea is to find
the longest contiguous matching subsequence that contains no "junk"
elements (R-O doesn't address junk).  The same idea is then applied
recursively to the pieces of the sequences to the left and to the right
of the matching subsequence.  This does not yield minimal edit
sequences, but does tend to yield matches that "look right" to people.
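
The recursion described above can be sketched in a few lines. This is a
minimal illustration, not the module's own implementation: it assumes
Python 3's standard-library difflib as a stand-in for the Python 2 copy
documented here, and matching_blocks_recursive is a hypothetical helper
name.

```python
from difflib import SequenceMatcher

def matching_blocks_recursive(sm, alo, ahi, blo, bhi):
    """Hypothetical helper: recurse to the left and right of the longest match."""
    i, j, k = sm.find_longest_match(alo, ahi, blo, bhi)
    if k == 0:
        # Nothing in common inside this pair of windows.
        return []
    left = matching_blocks_recursive(sm, alo, i, blo, j)
    right = matching_blocks_recursive(sm, i + k, ahi, j + k, bhi)
    return left + [(i, j, k)] + right

a, b = "abxcd", "abcd"
sm = SequenceMatcher(None, a, b)
blocks = matching_blocks_recursive(sm, 0, len(a), 0, len(b))
print(blocks)
```

The sketch omits the terminating dummy triple that .get_matching_blocks()
appends.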

SequenceMatcher tries to compute a "human-friendly diff" between two
sequences.  Unlike e.g. UNIX(tm) diff, the fundamental notion is the
longest *contiguous* & junk-free matching subsequence.  That's what
catches people's eyes.  The Windows(tm) windiff has another interesting
notion, pairing up elements that appear uniquely in each sequence.
That, and the method here, appear to yield more intuitive difference
reports than does diff.  This method appears to be the least vulnerable
to synching up on blocks of "junk lines", though (like blank lines in
ordinary text files, or maybe "<P>" lines in HTML files).  That may be
because this is the only method of the 3 that has a *concept* of
"junk" <wink>.

Example, comparing two strings, and considering blanks to be "junk":

>>> s = SequenceMatcher(lambda x: x == " ",
...                     "private Thread currentThread;",
...                     "private volatile Thread currentThread;")
>>>

.ratio() returns a float in [0, 1], measuring the "similarity" of the
sequences.  As a rule of thumb, a .ratio() value over 0.6 means the
sequences are close matches:

>>> print round(s.ratio(), 3)
0.866
>>>

If you're only interested in where the sequences match,
.get_matching_blocks() is handy:

>>> for block in s.get_matching_blocks():
...     print "a[%d] and b[%d] match for %d elements" % block
a[0] and b[0] match for 8 elements
a[8] and b[17] match for 6 elements
a[14] and b[23] match for 15 elements
a[29] and b[38] match for 0 elements

Note that the last tuple returned by .get_matching_blocks() is always a
dummy, (len(a), len(b), 0), and this is the only case in which the last
tuple element (number of elements matched) is 0.

If you want to know how to change the first sequence into the second,
use .get_opcodes():

>>> for opcode in s.get_opcodes():
...     print "%6s a[%d:%d] b[%d:%d]" % opcode
 equal a[0:8] b[0:8]
insert a[8:8] b[8:17]
 equal a[8:14] b[17:23]
 equal a[14:29] b[23:38]

See the Differ class for a fancy human-friendly file differencer, which
uses SequenceMatcher both to compare sequences of lines, and to compare
sequences of characters within similar (near-matching) lines.

See also function get_close_matches() in this module, which shows how
simple code building on SequenceMatcher can be used to do useful work.
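
As a rough illustration of get_close_matches(), the sketch below assumes
Python 3's standard-library difflib, whose function takes
(word, possibilities, n, cutoff); the word list is made up for the
example.

```python
from difflib import get_close_matches

words = ["ape", "apple", "peach", "puppy"]
# Keep at most 3 candidates whose ratio() against "appel" is at least 0.6,
# ordered from the best match downwards.
matches = get_close_matches("appel", words, n=3, cutoff=0.6)
print(matches)
```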

Timing:  Basic R-O is cubic time worst case and quadratic time expected
case.  SequenceMatcher is quadratic time for the worst case and has
expected-case behavior dependent in a complicated way on how many
elements the sequences have in common; best case time is linear.

Methods:

__init__(isjunk=None, a='', b='')
    Construct a SequenceMatcher.

set_seqs(a, b)
    Set the two sequences to be compared.

set_seq1(a)
    Set the first sequence to be compared.

set_seq2(b)
    Set the second sequence to be compared.

find_longest_match(alo, ahi, blo, bhi)
    Find longest matching block in a[alo:ahi] and b[blo:bhi].

get_matching_blocks()
    Return list of triples describing matching subsequences.

get_opcodes()
    Return list of 5-tuples describing how to turn a into b.

ratio()
    Return a measure of the sequences' similarity (float in [0,1]).

quick_ratio()
    Return an upper bound on .ratio() relatively quickly.

real_quick_ratio()
    Return an upper bound on ratio() very quickly.

Definition at line 27 of file difflib.py.

Constructor & Destructor Documentation

def __init__ (   self,
  isjunk = None,
  a = '',
  b = '' 
)
Construct a SequenceMatcher.

Optional arg isjunk is None (the default), or a one-argument
function that takes a sequence element and returns true iff the
element is junk.  None is equivalent to passing "lambda x: 0", i.e.
no elements are considered to be junk.  For example, pass
    lambda x: x in " \t"
if you're comparing lines as sequences of characters, and don't
want to synch up on blanks or hard tabs.

Optional arg a is the first of two sequences to be compared.  By
default, an empty string.  The elements of a must be hashable.  See
also .set_seqs() and .set_seq1().

Optional arg b is the second of two sequences to be compared.  By
default, an empty string.  The elements of b must be hashable. See
also .set_seqs() and .set_seq2().
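
The effect of isjunk can be seen in a short sketch. It assumes Python 3's
standard-library difflib in place of the module documented here; is_blank
is a hypothetical predicate name.

```python
from difflib import SequenceMatcher

def is_blank(ch):
    """Junk predicate: true only for a space or a hard tab."""
    return ch in " \t"

a = " abcd"
b = "abcd abcd"

no_junk = SequenceMatcher(None, a, b)
explicit_no_junk = SequenceMatcher(lambda x: False, a, b)  # same as None
blank_junk = SequenceMatcher(is_blank, a, b)

first = tuple(no_junk.find_longest_match(0, len(a), 0, len(b)))
second = tuple(explicit_no_junk.find_longest_match(0, len(a), 0, len(b)))
third = tuple(blank_junk.find_longest_match(0, len(a), 0, len(b)))
print(first, second, third)
```

With blanks treated as junk, the leading blank of a can no longer anchor
the match, so the longest block starts at a[1] instead of a[0].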

Definition at line 137 of file difflib.py.

    def __init__(self, isjunk=None, a='', b=''):
        """Construct a SequenceMatcher."""

        # Members:
        # a
        #      first sequence
        # b
        #      second sequence; differences are computed as "what do
        #      we need to do to 'a' to change it into 'b'?"
        # b2j
        #      for x in b, b2j[x] is a list of the indices (into b)
        #      at which x appears; junk elements do not appear
        # b2jhas
        #      b2j.has_key
        # fullbcount
        #      for x in b, fullbcount[x] == the number of times x
        #      appears in b; only materialized if really needed (used
        #      only for computing quick_ratio())
        # matching_blocks
        #      a list of (i, j, k) triples, where a[i:i+k] == b[j:j+k];
        #      ascending & non-overlapping in i and in j; terminated by
        #      a dummy (len(a), len(b), 0) sentinel
        # opcodes
        #      a list of (tag, i1, i2, j1, j2) tuples, where tag is
        #      one of
        #          'replace'   a[i1:i2] should be replaced by b[j1:j2]
        #          'delete'    a[i1:i2] should be deleted
        #          'insert'    b[j1:j2] should be inserted
        #          'equal'     a[i1:i2] == b[j1:j2]
        # isjunk
        #      a user-supplied function taking a sequence element and
        #      returning true iff the element is "junk" -- this has
        #      subtle but helpful effects on the algorithm, which I'll
        #      get around to writing up someday <0.9 wink>.
        #      DON'T USE!  Only __chain_b uses this.  Use isbjunk.
        # isbjunk
        #      for x in b, isbjunk(x) == isjunk(x) but much faster;
        #      it's really the has_key method of a hidden dict.
        #      DOES NOT WORK for x in a!

        self.isjunk = isjunk
        self.a = self.b = None
        self.set_seqs(a, b)

Member Function Documentation

def find_longest_match (   self,
  alo,
  ahi,
  blo,
  bhi 
)
Find longest matching block in a[alo:ahi] and b[blo:bhi].

If isjunk is not defined:

Return (i,j,k) such that a[i:i+k] is equal to b[j:j+k], where
    alo <= i <= i+k <= ahi
    blo <= j <= j+k <= bhi
and for all (i',j',k') meeting those conditions,
    k >= k'
    i <= i'
    and if i == i', j <= j'

In other words, of all maximal matching blocks, return one that
starts earliest in a, and of all those maximal matching blocks that
start earliest in a, return the one that starts earliest in b.

>>> s = SequenceMatcher(None, " abcd", "abcd abcd")
>>> s.find_longest_match(0, 5, 0, 9)
(0, 4, 5)

If isjunk is defined, first the longest matching block is
determined as above, but with the additional restriction that no
junk element appears in the block.  Then that block is extended as
far as possible by matching (only) junk elements on both sides.  So
the resulting block never matches on junk except as identical junk
happens to be adjacent to an "interesting" match.

Here's the same example as before, but considering blanks to be
junk.  That prevents " abcd" from matching the " abcd" at the tail
end of the second sequence directly.  Instead only the "abcd" can
match, and matches the leftmost "abcd" in the second sequence:

>>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
>>> s.find_longest_match(0, 5, 0, 9)
(1, 0, 4)

If no blocks match, return (alo, blo, 0).

>>> s = SequenceMatcher(None, "ab", "c")
>>> s.find_longest_match(0, 2, 0, 1)
(0, 0, 0)
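
The junk-free search can be sketched as a small standalone function. This
is a simplified illustration written for Python 3, not the method itself:
longest_match is a hypothetical name, junk handling is left out, and the
element index is rebuilt on every call, whereas the class builds b2j once
per second sequence.

```python
def longest_match(a, b, alo, ahi, blo, bhi):
    """Hypothetical junk-free version of the search, for illustration."""
    # Index b: element -> ascending list of positions (the role b2j plays).
    b2j = {}
    for j, elt in enumerate(b):
        b2j.setdefault(elt, []).append(j)

    besti, bestj, bestsize = alo, blo, 0
    j2len = {}  # j -> length of the longest match ending at a[i-1] and b[j]
    for i in range(alo, ahi):
        newj2len = {}
        for j in b2j.get(a[i], []):
            if j < blo:
                continue
            if j >= bhi:
                break
            # Extend the match that ended one step earlier in both sequences.
            k = newj2len[j] = j2len.get(j - 1, 0) + 1
            if k > bestsize:
                besti, bestj, bestsize = i - k + 1, j - k + 1, k
        j2len = newj2len
    return besti, bestj, bestsize

result = longest_match(" abcd", "abcd abcd", 0, 5, 0, 9)
empty = longest_match("ab", "c", 0, 2, 0, 1)
print(result, empty)
```

Only a strictly longer match replaces the current best, which is what
makes the earliest start in a, and then in b, win ties.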

Definition at line 314 of file difflib.py.

References SequenceMatcher.a, SequenceMatcher.b, SequenceMatcher.b2j, and SequenceMatcher.isbjunk.

    def find_longest_match(self, alo, ahi, blo, bhi):
        """Find longest matching block in a[alo:ahi] and b[blo:bhi]."""

        # CAUTION:  stripping common prefix or suffix would be incorrect.
        # E.g.,
        #    ab
        #    acab
        # Longest matching block is "ab", but if common prefix is
        # stripped, it's "a" (tied with "b").  UNIX(tm) diff does so
        # strip, so ends up claiming that ab is changed to acab by
        # inserting "ca" in the middle.  That's minimal but unintuitive:
        # "it's obvious" that someone inserted "ac" at the front.
        # Windiff ends up at the same place as diff, but by pairing up
        # the unique 'b's and then matching the first two 'a's.

        a, b, b2j, isbjunk = self.a, self.b, self.b2j, self.isbjunk
        besti, bestj, bestsize = alo, blo, 0
        # find longest junk-free match
        # during an iteration of the loop, j2len[j] = length of longest
        # junk-free match ending with a[i-1] and b[j]
        j2len = {}
        nothing = []
        for i in xrange(alo, ahi):
            # look at all instances of a[i] in b; note that because
            # b2j has no junk keys, the loop is skipped if a[i] is junk
            j2lenget = j2len.get
            newj2len = {}
            for j in b2j.get(a[i], nothing):
                # a[i] matches b[j]
                if j < blo:
                    continue
                if j >= bhi:
                    break
                k = newj2len[j] = j2lenget(j-1, 0) + 1
                if k > bestsize:
                    besti, bestj, bestsize = i-k+1, j-k+1, k
            j2len = newj2len

        # Now that we have a wholly interesting match (albeit possibly
        # empty!), we may as well suck up the matching junk on each
        # side of it too.  Can't think of a good reason not to, and it
        # saves post-processing the (possibly considerable) expense of
        # figuring out what to do with it.  In the case of an empty
        # interesting match, this is clearly the right thing to do,
        # because no other kind of match is possible in the regions.
        while besti > alo and bestj > blo and \
              isbjunk(b[bestj-1]) and \
              a[besti-1] == b[bestj-1]:
            besti, bestj, bestsize = besti-1, bestj-1, bestsize+1
        while besti+bestsize < ahi and bestj+bestsize < bhi and \
              isbjunk(b[bestj+bestsize]) and \
              a[besti+bestsize] == b[bestj+bestsize]:
            bestsize = bestsize + 1

        return besti, bestj, bestsize

def get_matching_blocks (   self)
Return list of triples describing matching subsequences.

Each triple is of the form (i, j, n), and means that
a[i:i+n] == b[j:j+n].  The triples are monotonically increasing in
i and in j.

The last triple is a dummy, (len(a), len(b), 0), and is the only
triple with n==0.

>>> s = SequenceMatcher(None, "abxcd", "abcd")
>>> s.get_matching_blocks()
[(0, 0, 2), (3, 2, 2), (5, 4, 0)]
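
The properties stated above can be checked directly. The sketch assumes
Python 3's standard-library difflib as a stand-in for this module.

```python
from difflib import SequenceMatcher

a, b = "abxcd", "abcd"
blocks = SequenceMatcher(None, a, b).get_matching_blocks()

# Every triple (i, j, n) names equal slices of a and b.
all_equal = all(a[i:i + n] == b[j:j + n] for i, j, n in blocks)

# The final triple is the dummy (len(a), len(b), 0).
sentinel = tuple(blocks[-1])

# Total number of matched elements across all blocks.
matched = sum(n for _, _, n in blocks)
print(all_equal, sentinel, matched)
```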

Definition at line 411 of file difflib.py.

References SequenceMatcher.__helper(), SequenceMatcher.a, SequenceMatcher.b, SequenceMatcher.find_longest_match(), and SequenceMatcher.matching_blocks.

    def get_matching_blocks(self):
        """Return list of triples describing matching subsequences."""

        if self.matching_blocks is not None:
            return self.matching_blocks
        self.matching_blocks = []
        la, lb = len(self.a), len(self.b)
        self.__helper(0, la, 0, lb, self.matching_blocks)
        self.matching_blocks.append( (la, lb, 0) )
        return self.matching_blocks

def get_opcodes (   self)
Return list of 5-tuples describing how to turn a into b.

Each tuple is of the form (tag, i1, i2, j1, j2).  The first tuple
has i1 == j1 == 0, and remaining tuples have i1 == the i2 from the
tuple preceding it, and likewise for j1 == the previous j2.

The tags are strings, with these meanings:

'replace':  a[i1:i2] should be replaced by b[j1:j2]
'delete':   a[i1:i2] should be deleted.
    Note that j1==j2 in this case.
'insert':   b[j1:j2] should be inserted at a[i1:i1].
    Note that i1==i2 in this case.
'equal':    a[i1:i2] == b[j1:j2]

>>> a = "qabxcd"
>>> b = "abycdf"
>>> s = SequenceMatcher(None, a, b)
>>> for tag, i1, i2, j1, j2 in s.get_opcodes():
...    print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
...           (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
 delete a[0:1] (q) b[0:0] ()
  equal a[1:3] (ab) b[0:2] (ab)
replace a[3:4] (x) b[2:3] (y)
  equal a[4:6] (cd) b[3:5] (cd)
 insert a[6:6] () b[5:6] (f)
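
One way to read the opcodes is as a recipe that rebuilds b from a. The
sketch below assumes Python 3's standard-library difflib; apply_opcodes is
a hypothetical helper, not part of the module.

```python
from difflib import SequenceMatcher

def apply_opcodes(a, b, opcodes):
    """Hypothetical helper: rebuild b from a by following the opcodes."""
    out = []
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            out.append(a[i1:i2])      # copy unchanged elements from a
        elif tag in ("replace", "insert"):
            out.append(b[j1:j2])      # take the new elements from b
        # 'delete' contributes nothing to the output
    return "".join(out)

a, b = "qabxcd", "abycdf"
opcodes = SequenceMatcher(None, a, b).get_opcodes()
rebuilt = apply_opcodes(a, b, opcodes)
tags = [op[0] for op in opcodes]
print(rebuilt, tags)
```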

Definition at line 449 of file difflib.py.

References SequenceMatcher.get_matching_blocks(), and SequenceMatcher.opcodes.

    def get_opcodes(self):
        """Return list of 5-tuples describing how to turn a into b."""

        if self.opcodes is not None:
            return self.opcodes
        i = j = 0
        self.opcodes = answer = []
        for ai, bj, size in self.get_matching_blocks():
            # invariant:  we've pumped out correct diffs to change
            # a[:i] into b[:j], and the next matching block is
            # a[ai:ai+size] == b[bj:bj+size].  So we need to pump
            # out a diff to change a[i:ai] into b[j:bj], pump out
            # the matching block, and move (i,j) beyond the match
            tag = ''
            if i < ai and j < bj:
                tag = 'replace'
            elif i < ai:
                tag = 'delete'
            elif j < bj:
                tag = 'insert'
            if tag:
                answer.append( (tag, i, ai, j, bj) )
            i, j = ai+size, bj+size
            # the list of matching blocks is terminated by a
            # sentinel with size 0
            if size:
                answer.append( ('equal', ai, i, bj, j) )
        return answer

def quick_ratio (   self)
Return an upper bound on ratio() relatively quickly.

This isn't defined beyond that it is an upper bound on .ratio(), and
is faster to compute.
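
The bound comes from counting common elements without regard to order.
A minimal sketch of that idea, assuming Python 3's standard-library
difflib and collections.Counter; multiset_bound is a hypothetical helper
and does not guard against two empty sequences.

```python
from collections import Counter
from difflib import SequenceMatcher

def multiset_bound(a, b):
    """Hypothetical helper: 2 * |multiset intersection| / (len(a) + len(b))."""
    common = Counter(a) & Counter(b)   # element-wise minimum of the counts
    matches = sum(common.values())
    return 2.0 * matches / (len(a) + len(b))

a, b = "abcd", "dcba"
sm = SequenceMatcher(None, a, b)
bound = multiset_bound(a, b)
quick = sm.quick_ratio()
exact = sm.ratio()
print(bound, quick, exact)
```

Reversing a string keeps every element, so the bound is 1.0, while the
order-sensitive .ratio() is far lower.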

Definition at line 530 of file difflib.py.

References SequenceMatcher.a, SequenceMatcher.b, and SequenceMatcher.fullbcount.

    def quick_ratio(self):
        """Return an upper bound on ratio() relatively quickly."""

        # viewing a and b as multisets, set matches to the cardinality
        # of their intersection; this counts the number of matches
        # without regard to order, so is clearly an upper bound
        if self.fullbcount is None:
            self.fullbcount = fullbcount = {}
            for elt in self.b:
                fullbcount[elt] = fullbcount.get(elt, 0) + 1
        fullbcount = self.fullbcount
        # avail[x] is the number of times x appears in 'b' less the
        # number of times we've seen it in 'a' so far ... kinda
        avail = {}
        availhas, matches = avail.has_key, 0
        for elt in self.a:
            if availhas(elt):
                numb = avail[elt]
            else:
                numb = fullbcount.get(elt, 0)
            avail[elt] = numb - 1
            if numb > 0:
                matches = matches + 1
        return 2.0 * matches / (len(self.a) + len(self.b))

def ratio (   self)
Return a measure of the sequences' similarity (float in [0,1]).

Where T is the total number of elements in both sequences, and
M is the number of matches, this is 2.0*M / T.
Note that this is 1 if the sequences are identical, and 0 if
they have nothing in common.

.ratio() is expensive to compute if you haven't already computed
.get_matching_blocks() or .get_opcodes(), in which case you may
want to try .quick_ratio() or .real_quick_ratio() first to get an
upper bound.

>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75
>>> s.quick_ratio()
0.75
>>> s.real_quick_ratio()
1.0
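
The formula can be reproduced by hand from the matching blocks. A small
sketch, assuming Python 3's standard-library difflib in place of this
module:

```python
from difflib import SequenceMatcher

a, b = "abcd", "bcde"
sm = SequenceMatcher(None, a, b)

# M: total size of all matching blocks (the final dummy block adds 0).
m = sum(size for _, _, size in sm.get_matching_blocks())
# T: total number of elements in both sequences.
t = len(a) + len(b)
by_hand = 2.0 * m / t
print(by_hand, sm.ratio())
```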

Definition at line 504 of file difflib.py.

References SequenceMatcher.a, SequenceMatcher.b, and SequenceMatcher.get_matching_blocks().

    def ratio(self):
        """Return a measure of the sequences' similarity (float in [0,1])."""

        matches = reduce(lambda sum, triple: sum + triple[-1],
                         self.get_matching_blocks(), 0)
        return 2.0 * matches / (len(self.a) + len(self.b))

def real_quick_ratio (   self)
Return an upper bound on ratio() very quickly.

This isn't defined beyond that it is an upper bound on .ratio(), and
is faster to compute than either .ratio() or .quick_ratio().
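
Because both quick methods are upper bounds, they can serve as cheap
filters before the expensive call. The sketch below assumes Python 3's
standard-library difflib; is_close is a hypothetical helper, and the 0.6
cutoff is only the rule of thumb quoted for .ratio().

```python
from difflib import SequenceMatcher

def is_close(sm, cutoff=0.6):
    """Hypothetical helper: test ratio() >= cutoff, cheapest bound first."""
    return (sm.real_quick_ratio() >= cutoff
            and sm.quick_ratio() >= cutoff
            and sm.ratio() >= cutoff)

a, b = "abcd", "a" * 40
sm = SequenceMatcher(None, a, b)

# The length-only bound: matches cannot exceed the shorter sequence.
by_hand = 2.0 * min(len(a), len(b)) / (len(a) + len(b))

far = is_close(sm)                                   # rejected by length alone
close = is_close(SequenceMatcher(None, "abcd", "bcde"))
print(by_hand, far, close)
```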

Definition at line 559 of file difflib.py.

References SequenceMatcher.a and SequenceMatcher.b.

    def real_quick_ratio(self):
        """Return an upper bound on ratio() very quickly."""

        la, lb = len(self.a), len(self.b)
        # can't have more matches than the number of elements in the
        # shorter sequence
        return 2.0 * min(la, lb) / (la + lb)

def set_seq1 (   self,
  a 
)
Set the first sequence to be compared.

The second sequence to be compared is not changed.

>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75
>>> s.set_seq1("bcde")
>>> s.ratio()
1.0
>>>

SequenceMatcher computes and caches detailed information about the
second sequence, so if you want to compare one sequence S against
many sequences, use .set_seq2(S) once and call .set_seq1(x)
repeatedly for each of the other sequences.

See also set_seqs() and set_seq2().
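
The one-against-many pattern might look like the sketch below, which
assumes Python 3's standard-library difflib; the target and candidate
strings are made up for illustration.

```python
from difflib import SequenceMatcher

target = "currentThread"
candidates = ["currentThread", "current_thread", "threadCount"]

sm = SequenceMatcher()
sm.set_seq2(target)            # analysed once, reused for every candidate
scores = {}
for word in candidates:
    sm.set_seq1(word)          # cheap: only discards the cached results
    scores[word] = sm.ratio()

best = max(scores, key=scores.get)
print(best, scores[best])
```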

Definition at line 210 of file difflib.py.

References SequenceMatcher.a.

    def set_seq1(self, a):
        """Set the first sequence to be compared."""

        if a is self.a:
            return
        self.a = a
        self.matching_blocks = self.opcodes = None

def set_seq2 (   self,
  b 
)
Set the second sequence to be compared.

The first sequence to be compared is not changed.

>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75
>>> s.set_seq2("abcd")
>>> s.ratio()
1.0
>>>

SequenceMatcher computes and caches detailed information about the
second sequence, so if you want to compare one sequence S against
many sequences, use .set_seq2(S) once and call .set_seq1(x)
repeatedly for each of the other sequences.

See also set_seqs() and set_seq1().

Definition at line 236 of file difflib.py.

References SequenceMatcher.b, SequenceMatcher.matching_blocks, and SequenceMatcher.opcodes.

    def set_seq2(self, b):
        """Set the second sequence to be compared."""

        if b is self.b:
            return
        self.b = b
        self.matching_blocks = self.opcodes = None
        self.fullbcount = None
        self.__chain_b()

def set_seqs (   self,
  a,
  b 
)
Set the two sequences to be compared.

>>> s = SequenceMatcher()
>>> s.set_seqs("abcd", "bcde")
>>> s.ratio()
0.75

Definition at line 198 of file difflib.py.

References SequenceMatcher.set_seq1(), and SequenceMatcher.set_seq2().

    def set_seqs(self, a, b):
        """Set the two sequences to be compared."""

        self.set_seq1(a)
        self.set_seq2(b)

Field Documentation

a

Definition at line 195 of file difflib.py.

b

Definition at line 195 of file difflib.py.

b2j

Definition at line 287 of file difflib.py.

b2jhas

Definition at line 288 of file difflib.py.

fullbcount

Definition at line 261 of file difflib.py.

isbjunk

Definition at line 312 of file difflib.py.

isjunk

Definition at line 194 of file difflib.py.

matching_blocks

Definition at line 234 of file difflib.py.

opcodes

Definition at line 234 of file difflib.py.


The documentation for this class was generated from the following file: