Extend hashed symbol dictionaries to work with Ada
author Paul N. Hilfinger <hilfinger@adacore.com>
Thu, 7 Oct 2010 06:53:44 +0000 (06:53 +0000)
committer Paul N. Hilfinger <hilfinger@adacore.com>
Thu, 7 Oct 2010 06:53:44 +0000 (06:53 +0000)
This patch allows Ada to speed up symbol lookup by using the facilities
in dictionary.[ch] for hashed lookups.  First, we generalize dictionary
search to allow clients to specify any matching function compatible with
the hashing function. Next, we modify the hashing algorithm so that symbols
that wild-match a name hash to the same value.  Finally, we modify Ada
symbol lookup to use these facilities.
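
As a rough illustration of the first step, the sketch below iterates over a
toy linear table with a caller-supplied comparator that follows the
strcmp_iw convention (0 means match, symbol name first, search name second).
Every toy_* name is hypothetical, and toy_wild_match is a deliberately
simplified stand-in for Ada's wild_match (it accepts only an exact name or a
"__"-separated suffix); none of this is the code in dictionary.c or
ada-lang.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Comparator convention borrowed from strcmp_iw: return 0 on a match;
   the symbol's name is the first argument, the search name the second.  */
typedef int (*name_compare_fn) (const char *sym_name,
                                const char *search_name);

/* Hypothetical stand-in for a linear dictionary: just an array of names.  */
struct toy_dict
{
  const char *const *names;
  int nsyms;
};

struct toy_iter
{
  const struct toy_dict *dict;
  int index;
};

/* Advance ITER to the next name accepted by COMPARE, or return NULL.  */
static const char *
toy_iter_match_next (const char *name, name_compare_fn compare,
                     struct toy_iter *iter)
{
  int i;

  for (i = iter->index + 1; i < iter->dict->nsyms; ++i)
    if (compare (iter->dict->names[i], name) == 0)
      {
        iter->index = i;
        return iter->dict->names[i];
      }
  iter->index = iter->dict->nsyms;
  return NULL;
}

/* Start an iteration over DICT and return the first match, if any.  */
static const char *
toy_iter_match_first (const struct toy_dict *dict, const char *name,
                      name_compare_fn compare, struct toy_iter *iter)
{
  iter->dict = dict;
  iter->index = -1;
  return toy_iter_match_next (name, compare, iter);
}

/* Simplified wild match (an assumption, not Ada's real rule): SYM_NAME
   matches if it equals SEARCH_NAME or ends with "__" SEARCH_NAME.  */
static int
toy_wild_match (const char *sym_name, const char *search_name)
{
  size_t sym_len = strlen (sym_name);
  size_t name_len = strlen (search_name);

  if (sym_len < name_len
      || strcmp (sym_name + sym_len - name_len, search_name) != 0)
    return 1;
  if (sym_len == name_len)
    return 0;
  return !(sym_len - name_len >= 2
           && strncmp (sym_name + sym_len - name_len - 2, "__", 2) == 0);
}

/* Count the names in DICT that COMPARE accepts for NAME.  */
static int
toy_count_matches (const struct toy_dict *dict, const char *name,
                   name_compare_fn compare)
{
  struct toy_iter iter;
  const char *sym;
  int count = 0;

  for (sym = toy_iter_match_first (dict, name, compare, &iter);
       sym != NULL;
       sym = toy_iter_match_next (name, compare, &iter))
    count++;
  return count;
}

/* Exercise the same table with two different comparators.  */
static void
toy_demo (void)
{
  static const char *const names[] =
    { "pack__foo", "foo", "barfoo", "other__pack__foo", "foo__bar" };
  struct toy_dict dict = { names, 5 };

  assert (toy_count_matches (&dict, "foo", toy_wild_match) == 3);
  assert (toy_count_matches (&dict, "foo", strcmp) == 1);
}
```

Calling toy_demo () from a main function runs both checks.  The point of the
design is that the iteration code never changes; only the comparator passed
in decides what counts as a match.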

Because this patch touches on a hashing algorithm used by other
languages, I took the precaution of doing a speed test on a list of
about 12000 identifiers (repeatedly inserting all of them into a table
and then doing a lookup on a million names at random, thus testing the
speed of the hashing algorithm and how well it distributed names).
There was actually a slight speedup, probably as a result of open-
coding some of the tests in msymbol_hash_iw.  By design, the revised
hashing algorithm produces the same results as the original on most
"normal" C identifiers.

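The hashing property described above can be checked with a small standalone
sketch.  sketch_dict_hash repeats the logic of the dict_hash function added
by this patch; sketch_plain_hash is an assumed baseline that applies the same
multiply-and-add step to every character with no special cases, and is only
a stand-in for the previous hash, not a copy of msymbol_hash_iw.  The sketch_*
names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Same logic as the patch's dict_hash, renamed to mark it as a sketch.  */
static unsigned int
sketch_dict_hash (const char *string)
{
  unsigned int hash = 0;
  int c;

  /* Ignore the optional "_ada_" prefix.  */
  if (*string == '_' && strncmp (string, "_ada_", 5) == 0)
    string += 5;

  while (*string)
    {
      switch (*string)
        {
        case '$':
        case '.':
        case 'X':
        case '(':
          /* Start of a suffix: stop hashing here.  */
          return hash;
        case ' ':
          string += 1;
          break;
        case '_':
          if (string[1] == '_')
            {
              /* "__" not followed by a lower-case letter or 'O' also
                 starts a suffix; otherwise a new component begins, so
                 restart the hash.  */
              if (((c = string[2]) < 'a' || c > 'z') && c != 'O')
                return hash;
              hash = 0;
              string += 2;
              break;
            }
          /* FALL THROUGH */
        default:
          hash = hash * 67 + *string - 113;
          string += 1;
          break;
        }
    }
  return hash;
}

/* Assumed baseline: the same step applied to every character.  */
static unsigned int
sketch_plain_hash (const char *string)
{
  unsigned int hash = 0;

  for (; *string; string++)
    hash = hash * 67 + *string - 113;
  return hash;
}

static void
sketch_hash_demo (void)
{
  /* Names that wild-match "foo" land on the same hash value.  */
  assert (sketch_dict_hash ("pack__foo") == sketch_dict_hash ("foo"));
  assert (sketch_dict_hash ("_ada_pack__foo") == sketch_dict_hash ("foo"));
  assert (sketch_dict_hash ("pack__foo__2") == sketch_dict_hash ("foo"));

  /* Identifiers without "__", 'X', '$', '.', '(' or spaces are unchanged.  */
  assert (sketch_dict_hash ("main") == sketch_plain_hash ("main"));
  assert (sketch_dict_hash ("foo_bar") == sketch_plain_hash ("foo_bar"));
}
```

Calling sketch_hash_demo () from a main function runs the checks.  Note that
an identifier containing an upper-case 'X' is cut off at that character,
which is one reason the claim is limited to most, not all, C identifiers.
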
We considered augmenting the dictionary interface still further by allowing
different hashing algorithms for different dictionaries, based on the
(supposed) language of the symbols in that dictionary.  While this produced
better isolation of the changes to Ada programs, the additional flexibility
also complicated the dictionary interface.  I'd prefer to keep things
simple for now.

Tested without regressions on Linux i686.

ChangeLog:

gdb/
* ada-lang.c (ada_match_name): Use new API for wild_match.
(wild_match): Change API to be consistent with that of strcmp_iw;
return 0 for a match, and switch operand order.
(full_match): New function.
(ada_add_block_symbols): Use dict_iter_match_{first,next} for
matching to allow use of hashing.
* dictionary.c (struct dict_vector): Generalize iter_name_first,
iter_name_next to iter_match_first, iter_match_next.
(iter_name_first_hashed): Replace with iter_match_first_hashed.
(iter_name_next_hashed): Replace with iter_match_next_hashed.
(iter_name_first_linear): Replace with iter_match_first_linear.
(iter_name_next_linear): Replace with iter_match_next_linear.
(dict_iter_name_first): Re-implement to use dict_iter_match_first.
(dict_iter_name_next): Re-implement to use dict_iter_match_next.
(dict_iter_match_first): New function.
(dict_iter_match_next): New function.
(dict_hash): New function.
* dictionary.h (dict_iter_match_first, dict_iter_match_next): Declare.
* psymtab.c (ada_lookup_partial_symbol): Use new wild_match API.

gdb/ChangeLog
gdb/ada-lang.c
gdb/dictionary.c
gdb/dictionary.h

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index a7dad2647b297109d39267a5ea0071a0efefad8c..9e5a615b76cec23056c63afc0ce9e0385bc1953b 100644
@@ -1,3 +1,25 @@
+2010-10-06  Paul Hilfinger  <hilfinger@adacore.com>
+
+       * ada-lang.c (ada_match_name): Use new API for wild_match.
+       (wild_match): Change API to be consistent with that of strcmp_iw;
+       return 0 for a match, and switch operand order.
+       (full_match): New function.
+       (ada_add_block_symbols): Use dict_iter_match_{first,next} for
+       matching to allow use of hashing.
+       * dictionary.c (struct dict_vector): Generalize iter_name_first,
+       iter_name_next to iter_match_first, iter_match_next.
+       (iter_name_first_hashed): Replace with iter_match_first_hashed.
+       (iter_name_next_hashed): Replace with iter_match_next_hashed.
+       (iter_name_first_linear): Replace with iter_match_first_linear.
+       (iter_name_next_linear): Replace with iter_match_next_linear.
+       (dict_iter_name_first): Re-implement to use dict_iter_match_first.
+       (dict_iter_name_next): Re-implement to use dict_iter_match_next.
+       (dict_iter_match_first): New function.
+       (dict_iter_match_next): New function.
+       (dict_hash): New function.
+       * dictionary.h (dict_iter_match_first, dict_iter_match_next): Declare.
+       * psymtab.c (ada_lookup_partial_symbol): Use new wild_match API.
+
 2010-10-06  Doug Evans  <dje@google.com>
 
        * data-directory/Makefile.in: Remove @host_makefile_frag@, @frags@.
diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 5fdf8511f00ffe440fdd0bc5db6994c95d8e14d0..c111e40356e0b4dc409405359f885da1fa7589a3 100644
@@ -5062,6 +5062,13 @@ wild_match (const char *name, const char *patn)
     }
 }
 
+static int
+full_match (const char *sym_name, const char *search_name)
+{
+  return !ada_match_name (sym_name, search_name, 0);
+}
+
+
 /* Add symbols from BLOCK matching identifier NAME in DOMAIN to
    vector *defn_symbols, updating the list of symbols in OBSTACKP 
    (if necessary).  If WILD, treat as NAME with a wildcard prefix. 
@@ -5086,9 +5093,9 @@ ada_add_block_symbols (struct obstack *obstackp,
   found_sym = 0;
   if (wild)
     {
-      struct symbol *sym;
-
-      ALL_BLOCK_SYMBOLS (block, iter, sym)
+      for (sym = dict_iter_match_first (BLOCK_DICT (block), name,
+                                       wild_match, &iter);
+          sym != NULL; sym = dict_iter_match_next (name, wild_match, &iter))
       {
         if (symbol_matches_domain (SYMBOL_LANGUAGE (sym),
                                    SYMBOL_DOMAIN (sym), domain)
@@ -5110,29 +5117,25 @@ ada_add_block_symbols (struct obstack *obstackp,
     }
   else
     {
-      ALL_BLOCK_SYMBOLS (block, iter, sym)
+     for (sym = dict_iter_match_first (BLOCK_DICT (block), name,
+                                       full_match, &iter);
+          sym != NULL; sym = dict_iter_match_next (name, full_match, &iter))
       {
         if (symbol_matches_domain (SYMBOL_LANGUAGE (sym),
                                    SYMBOL_DOMAIN (sym), domain))
           {
-            int cmp = strncmp (name, SYMBOL_LINKAGE_NAME (sym), name_len);
-
-            if (cmp == 0
-                && is_name_suffix (SYMBOL_LINKAGE_NAME (sym) + name_len))
-              {
-               if (SYMBOL_CLASS (sym) != LOC_UNRESOLVED)
+           if (SYMBOL_CLASS (sym) != LOC_UNRESOLVED)
+             {
+               if (SYMBOL_IS_ARGUMENT (sym))
+                 arg_sym = sym;
+               else
                  {
-                   if (SYMBOL_IS_ARGUMENT (sym))
-                     arg_sym = sym;
-                   else
-                     {
-                       found_sym = 1;
-                       add_defn_to_vec (obstackp,
-                                        fixup_symbol_section (sym, objfile),
-                                        block);
-                     }
+                   found_sym = 1;
+                   add_defn_to_vec (obstackp,
+                                    fixup_symbol_section (sym, objfile),
+                                    block);
                  }
-              }
+             }
           }
       }
     }
diff --git a/gdb/dictionary.c b/gdb/dictionary.c
index e1c2010bc1c386d46bf4d821b441a9982b10ec64..f3ac3069adbea1cd70a7afca69ac4e0fdffbeb58 100644
@@ -21,6 +21,7 @@
    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
 #include "defs.h"
+#include <ctype.h>
 #include "gdb_obstack.h"
 #include "symtab.h"
 #include "buildsym.h"
@@ -116,11 +117,15 @@ struct dict_vector
                                    struct dict_iterator *iterator);
   struct symbol *(*iterator_next) (struct dict_iterator *iterator);
   /* Functions to iterate over symbols with a given name.  */
-  struct symbol *(*iter_name_first) (const struct dictionary *dict,
+  struct symbol *(*iter_match_first) (const struct dictionary *dict,
                                     const char *name,
+                                    int (*equiv) (const char *,
+                                                  const char *),
+                                    struct dict_iterator *iterator);
+  struct symbol *(*iter_match_next) (const char *name,
+                                    int (*equiv) (const char *,
+                                                  const char *),
                                     struct dict_iterator *iterator);
-  struct symbol *(*iter_name_next) (const char *name,
-                                   struct dict_iterator *iterator);
   /* A size function, for maint print symtabs.  */
   int (*size) (const struct dictionary *dict);
 };
@@ -236,12 +241,18 @@ static struct symbol *iterator_first_hashed (const struct dictionary *dict,
 
 static struct symbol *iterator_next_hashed (struct dict_iterator *iterator);
 
-static struct symbol *iter_name_first_hashed (const struct dictionary *dict,
-                                             const char *name,
+static struct symbol *iter_match_first_hashed (const struct dictionary *dict,
+                                              const char *name,
+                                              int (*compare) (const char *,
+                                                              const char *),
                                              struct dict_iterator *iterator);
 
-static struct symbol *iter_name_next_hashed (const char *name,
-                                            struct dict_iterator *iterator);
+static struct symbol *iter_match_next_hashed (const char *name,
+                                             int (*compare) (const char *,
+                                                             const char *),
+                                             struct dict_iterator *iterator);
+
+static unsigned int dict_hash (const char *string);
 
 /* Functions only for DICT_HASHED.  */
 
@@ -264,12 +275,16 @@ static struct symbol *iterator_first_linear (const struct dictionary *dict,
 
 static struct symbol *iterator_next_linear (struct dict_iterator *iterator);
 
-static struct symbol *iter_name_first_linear (const struct dictionary *dict,
-                                             const char *name,
-                                             struct dict_iterator *iterator);
+static struct symbol *iter_match_first_linear (const struct dictionary *dict,
+                                              const char *name,
+                                              int (*compare) (const char *,
+                                                              const char *),
+                                              struct dict_iterator *iterator);
 
-static struct symbol *iter_name_next_linear (const char *name,
-                                            struct dict_iterator *iterator);
+static struct symbol *iter_match_next_linear (const char *name,
+                                             int (*compare) (const char *,
+                                                             const char *),
+                                             struct dict_iterator *iterator);
 
 static int size_linear (const struct dictionary *dict);
 
@@ -289,8 +304,8 @@ static const struct dict_vector dict_hashed_vector =
     add_symbol_nonexpandable,          /* add_symbol */
     iterator_first_hashed,             /* iterator_first */
     iterator_next_hashed,              /* iterator_next */
-    iter_name_first_hashed,            /* iter_name_first */
-    iter_name_next_hashed,             /* iter_name_next */
+    iter_match_first_hashed,           /* iter_match_first */
+    iter_match_next_hashed,            /* iter_match_next */
     size_hashed,                       /* size */
   };
 
@@ -301,8 +316,8 @@ static const struct dict_vector dict_hashed_expandable_vector =
     add_symbol_hashed_expandable,      /* add_symbol */
     iterator_first_hashed,             /* iterator_first */
     iterator_next_hashed,              /* iterator_next */
-    iter_name_first_hashed,            /* iter_name_first */
-    iter_name_next_hashed,             /* iter_name_next */
+    iter_match_first_hashed,           /* iter_match_first */
+    iter_match_next_hashed,            /* iter_match_next */
     size_hashed_expandable,            /* size */
   };
 
@@ -313,8 +328,8 @@ static const struct dict_vector dict_linear_vector =
     add_symbol_nonexpandable,          /* add_symbol */
     iterator_first_linear,             /* iterator_first */
     iterator_next_linear,              /* iterator_next */
-    iter_name_first_linear,            /* iter_name_first */
-    iter_name_next_linear,             /* iter_name_next */
+    iter_match_first_linear,           /* iter_match_first */
+    iter_match_next_linear,            /* iter_match_next */
     size_linear,                       /* size */
   };
 
@@ -325,8 +340,8 @@ static const struct dict_vector dict_linear_expandable_vector =
     add_symbol_linear_expandable,      /* add_symbol */
     iterator_first_linear,             /* iterator_first */
     iterator_next_linear,              /* iterator_next */
-    iter_name_first_linear,            /* iter_name_first */
-    iter_name_next_linear,             /* iter_name_next */
+    iter_match_first_linear,           /* iter_match_first */
+    iter_match_next_linear,            /* iter_match_next */
     size_linear,                       /* size */
   };
 
@@ -516,14 +531,31 @@ dict_iter_name_first (const struct dictionary *dict,
                      const char *name,
                      struct dict_iterator *iterator)
 {
-  return (DICT_VECTOR (dict))->iter_name_first (dict, name, iterator);
+  return dict_iter_match_first (dict, name, strcmp_iw, iterator);
 }
 
 struct symbol *
 dict_iter_name_next (const char *name, struct dict_iterator *iterator)
+{
+  return dict_iter_match_next (name, strcmp_iw, iterator);
+}
+
+struct symbol *
+dict_iter_match_first (const struct dictionary *dict,
+                      const char *name,
+                      int (*compare) (const char *, const char *),
+                      struct dict_iterator *iterator)
+{
+  return (DICT_VECTOR (dict))->iter_match_first (dict, name, compare, iterator);
+}
+
+struct symbol *
+dict_iter_match_next (const char *name,
+                     int (*compare) (const char *, const char *),
+                     struct dict_iterator *iterator)
 {
   return (DICT_VECTOR (DICT_ITERATOR_DICT (iterator)))
-    ->iter_name_next (name, iterator);
+    ->iter_match_next (name, compare, iterator);
 }
 
 int
@@ -614,12 +646,12 @@ iterator_hashed_advance (struct dict_iterator *iterator)
 }
 
 static struct symbol *
-iter_name_first_hashed (const struct dictionary *dict,
-                       const char *name,
-                       struct dict_iterator *iterator)
+iter_match_first_hashed (const struct dictionary *dict,
+                        const char *name,
+                        int (*compare) (const char *, const char *),
+                        struct dict_iterator *iterator)
 {
-  unsigned int hash_index
-    = msymbol_hash_iw (name) % DICT_HASHED_NBUCKETS (dict);
+  unsigned int hash_index = dict_hash (name) % DICT_HASHED_NBUCKETS (dict);
   struct symbol *sym;
 
   DICT_ITERATOR_DICT (iterator) = dict;
@@ -632,8 +664,8 @@ iter_name_first_hashed (const struct dictionary *dict,
        sym != NULL;
        sym = sym->hash_next)
     {
-      /* Warning: the order of arguments to strcmp_iw matters!  */
-      if (strcmp_iw (SYMBOL_SEARCH_NAME (sym), name) == 0)
+      /* Warning: the order of arguments to compare matters!  */
+      if (compare (SYMBOL_SEARCH_NAME (sym), name) == 0)
        {
          break;
        }
@@ -645,7 +677,9 @@ iter_name_first_hashed (const struct dictionary *dict,
 }
 
 static struct symbol *
-iter_name_next_hashed (const char *name, struct dict_iterator *iterator)
+iter_match_next_hashed (const char *name,
+                       int (*compare) (const char *, const char *),
+                       struct dict_iterator *iterator)
 {
   struct symbol *next;
 
@@ -653,7 +687,7 @@ iter_name_next_hashed (const char *name, struct dict_iterator *iterator)
        next != NULL;
        next = next->hash_next)
     {
-      if (strcmp_iw (SYMBOL_SEARCH_NAME (next), name) == 0)
+      if (compare (SYMBOL_SEARCH_NAME (next), name) == 0)
        break;
     }
 
@@ -671,8 +705,8 @@ insert_symbol_hashed (struct dictionary *dict,
   unsigned int hash_index;
   struct symbol **buckets = DICT_HASHED_BUCKETS (dict);
 
-  hash_index = (msymbol_hash_iw (SYMBOL_SEARCH_NAME (sym))
-               % DICT_HASHED_NBUCKETS (dict));
+  hash_index = 
+    dict_hash (SYMBOL_SEARCH_NAME (sym)) % DICT_HASHED_NBUCKETS (dict);
   sym->hash_next = buckets[hash_index];
   buckets[hash_index] = sym;
 }
@@ -746,6 +780,60 @@ expand_hashtable (struct dictionary *dict)
   xfree (old_buckets);
 }
 
+/* Produce an unsigned hash value from STRING that is consistent
+   with strcmp_iw, strcmp, and, at least on Ada symbols, wild_match.
+   That is, two identifiers equivalent according to any of those three
+   comparison operators hash to the same value.  */
+
+static unsigned int
+dict_hash (const char *string)
+{
+  /* The Ada-encoded version of a name P1.P2...Pn has either the form
+     P1__P2__...Pn<suffix> or _ada_P1__P2__...Pn<suffix> (where the Pi
+     are lower-cased identifiers).  The <suffix> (which can be empty)
+     encodes additional information about the denoted entity.  This
+     routine hashes such names to msymbol_hash_iw(Pn).  It actually
+     does this for a superset of both valid Pi and of <suffix>, but 
+     in other cases it simply returns msymbol_hash_iw(STRING).  */
+
+  unsigned int hash;
+  int c;
+
+  if (*string == '_' && strncmp (string, "_ada_", 5) == 0)
+    string += 5;
+
+  hash = 0;
+  while (*string)
+    {
+      switch (*string)
+       {
+       case '$':
+       case '.':
+       case 'X':
+       case '(':
+         return hash;
+       case ' ':
+         string += 1;
+         break;
+       case '_':
+         if (string[1] == '_')
+           {
+             if (((c = string[2]) < 'a' || c > 'z') && c != 'O')
+               return hash;
+             hash = 0;
+             string += 2;
+             break;
+           }
+         /* FALL THROUGH */
+       default:
+         hash = hash * 67 + *string - 113;
+         string += 1;
+         break;
+       }
+    }
+  return hash;
+}
+
 /* Functions for DICT_LINEAR and DICT_LINEAR_EXPANDABLE.  */
 
 static struct symbol *
@@ -769,18 +857,21 @@ iterator_next_linear (struct dict_iterator *iterator)
 }
 
 static struct symbol *
-iter_name_first_linear (const struct dictionary *dict,
-                       const char *name,
-                       struct dict_iterator *iterator)
+iter_match_first_linear (const struct dictionary *dict,
+                        const char *name,
+                        int (*compare) (const char *, const char *),
+                        struct dict_iterator *iterator)
 {
   DICT_ITERATOR_DICT (iterator) = dict;
   DICT_ITERATOR_INDEX (iterator) = -1;
 
-  return iter_name_next_linear (name, iterator);
+  return iter_match_next_linear (name, compare, iterator);
 }
 
 static struct symbol *
-iter_name_next_linear (const char *name, struct dict_iterator *iterator)
+iter_match_next_linear (const char *name,
+                       int (*compare) (const char *, const char *),
+                       struct dict_iterator *iterator)
 {
   const struct dictionary *dict = DICT_ITERATOR_DICT (iterator);
   int i, nsyms = DICT_LINEAR_NSYMS (dict);
@@ -789,7 +880,7 @@ iter_name_next_linear (const char *name, struct dict_iterator *iterator)
   for (i = DICT_ITERATOR_INDEX (iterator) + 1; i < nsyms; ++i)
     {
       sym = DICT_LINEAR_SYM (dict, i);
-      if (strcmp_iw (SYMBOL_SEARCH_NAME (sym), name) == 0)
+      if (compare (SYMBOL_SEARCH_NAME (sym), name) == 0)
        {
          retval = sym;
          break;
diff --git a/gdb/dictionary.h b/gdb/dictionary.h
index 2242a791a185607d2f264b8449c99670bbbc1ad4..f7d30350ed7ed4a9c78a8705be120b125aead55a 100644
@@ -134,6 +134,31 @@ extern struct symbol *dict_iter_name_first (const struct dictionary *dict,
 extern struct symbol *dict_iter_name_next (const char *name,
                                           struct dict_iterator *iterator);
 
+/* Initialize ITERATOR to point at the first symbol in DICT whose
+   SYMBOL_SEARCH_NAME is NAME, as tested using COMPARE (which must use
+   the same conventions as strcmp_iw and be compatible with any
+   dictionary hashing function), and return that first symbol, or NULL
+   if there are no such symbols.  */
+
+extern struct symbol *dict_iter_match_first (const struct dictionary *dict,
+                                            const char *name,
+                                            int (*compare) (const char*, 
+                                                            const char *),
+                                            struct dict_iterator *iterator);
+
+/* Advance ITERATOR to point at the next symbol in DICT whose
+   SYMBOL_SEARCH_NAME is NAME, as tested using COMPARE (see
+   dict_iter_match_first), or NULL if there are no more such symbols.
+   Don't call this if you've previously received NULL from
+   dict_iter_match_first or dict_iter_match_next on this
+   iteration. And don't call it unless ITERATOR was created by a
+   previous call to dict_iter_match_first with the same NAME and COMPARE.  */
+
+extern struct symbol *dict_iter_match_next (const char *name,
+                                           int (*compare) (const char*, 
+                                                           const char *),
+                                           struct dict_iterator *iterator);
+
 /* Return some notion of the size of the dictionary: the number of
    symbols if we have that, the number of hash buckets otherwise.  */