nir/dead_cf: delete code that's unreachable due to jumps

diff --git a/src/glsl/README b/src/glsl/README
index 74520321b21fad9c1169871b5531106f92f85104..bfcf69f903af13237e82c9bb81c0f5ff4433a046 100644
--- a/src/glsl/README
+++ b/src/glsl/README
@@ -8,7 +8,7 @@ passed straight through.  See glcpp/*
  
  2) lex and yacc-based parser takes the preprocessed string and
  generates the AST (abstract syntax tree).  Almost no checking is
-performed in this stage.  See glsl_lexer.lpp and glsl_parser.ypp.
+performed in this stage.  See glsl_lexer.ll and glsl_parser.yy.
  
  3) The AST is converted to "HIR".  This is the intermediate
  representation of the compiler.  Constructors are generated, function
@@ -34,7 +34,7 @@ linked in.
  
  7) The driver performs code generation out of the IR, taking a linked
  shader program and producing a compiled program for each stage.  See
-ir_to_mesa.cpp for Mesa IR code generation.
+../mesa/program/ir_to_mesa.cpp for Mesa IR code generation.
  
  FAQ:
  
@@ -126,7 +126,7 @@ optimizations like CSE where one must navigate an expression tree.
  
  Q: Why no SSA representation?
  
-A: Converting an IR tree to SSA form makes dead code elmimination,
+A: Converting an IR tree to SSA form makes dead code elimination,
  common subexpression elimination, and many other optimizations much
  easier.  However, in our primarily vector-based language, there's some
  major questions as to how it would work.  Do we do SSA on the scalar
@@ -134,9 +134,9 @@ or vector level?  If we do it at the vector level, we're going to end
  up with many different versions of the variable when encountering code
  like:
  
-(assign (constant bool (1)) (swiz x (var_ref __retval) ) (var_ref a) ) 
-(assign (constant bool (1)) (swiz y (var_ref __retval) ) (var_ref b) ) 
-(assign (constant bool (1)) (swiz z (var_ref __retval) ) (var_ref c) ) 
+(assign (constant bool (1)) (swiz x (var_ref __retval) ) (var_ref a) )
+(assign (constant bool (1)) (swiz y (var_ref __retval) ) (var_ref b) )
+(assign (constant bool (1)) (swiz z (var_ref __retval) ) (var_ref c) )
  
  If every masked update of a component relies on the previous value of
  the variable, then we're probably going to be quite limited in our
@@ -156,10 +156,10 @@ for the 965 fragment shader backend when that is developed.
  Q: How should I expand instructions that take multiple backend instructions?
  
  Sometimes you'll have to do the expansion in your code generation --
-see, for example, ir_to_mesa.cpp's handling of ir_binop_mul for
-matrices.  However, in many cases you'll want to do a pass over the IR
-to convert non-native instructions to a series of native instructions.
-For example, for the Mesa backend we have ir_div_to_mul_rcp.cpp because
+see, for example, ir_to_mesa.cpp's handling of ir_unop_sqrt.  However,
+in many cases you'll want to do a pass over the IR to convert
+non-native instructions to a series of native instructions.  For
+example, for the Mesa backend we have ir_div_to_mul_rcp.cpp because
  Mesa IR (and many hardware backends) only have a reciprocal
  instruction, not a divide.  Implementing non-native instructions this
  way gives the chance for constant folding to occur, so (a / 2.0)
@@ -177,14 +177,52 @@ ir_unop_fract was added.  The following areas need updating to add a
  new expression type:
  
  ir.h (new enum)
-ir.cpp:get_num_operands() (used for ir_reader)
  ir.cpp:operator_strs (used for ir_reader)
  ir_constant_expression.cpp (you probably want to be able to constant fold)
+ir_validate.cpp (check users have the right types)
  
  You may also need to update the backends if they will see the new expr type:
  
-../mesa/shaders/ir_to_mesa.cpp
+../mesa/program/ir_to_mesa.cpp
  
  You can then use the new expression from builtins (if all backends
  would rather see it), or scan the IR and convert to use your new
-expression type (see ir_mod_to_fract, for example).
+expression type (see ir_mod_to_floor, for example).
+
+Q: How is memory management handled in the compiler?
+
+The hierarchical memory allocator "talloc" developed for the Samba
+project is used, so that things like optimization passes don't have to
+worry about their garbage collection so much.  It has a few nice
+features, including low performance overhead and good debugging
+support that's trivially available.
+
+Generally, each stage of the compile creates a talloc context and
+allocates its memory out of that or children of it.  At the end of the
+stage, the pieces still live are stolen to a new context and the old
+one freed, or the whole context is kept for use by the next stage.
+
+For IR transformations, a temporary context is used, then at the end
+of all transformations, reparent_ir reparents all live nodes under the
+shader's IR list, and the old context full of dead nodes is freed.
+When developing a single IR transformation pass, this means that you
+want to allocate instruction nodes out of the temporary context, so if
+it becomes dead it doesn't live on as the child of a live node.  At
+the moment, optimization passes aren't passed that temporary context,
+so they find it by calling talloc_parent() on a nearby IR node.  The
+talloc_parent() call is expensive, so many passes will cache the
+result of the first talloc_parent().  Cleaning up all the optimization
+passes to take a context argument and not call talloc_parent() is left
+as an exercise.
+
+Q: What is the file naming convention in this directory?
+
+Initially, there really wasn't one.  We have since adopted one:
+
+ - Files that implement code lowering passes should be named lower_*
+   (e.g., lower_noise.cpp).
+ - Files that implement optimization passes should be named opt_*.
+ - Files that implement a class that is used throughout the code should
+   take the name of that class (e.g., ir_hierarchical_visitor.cpp).
+ - Files that contain code not fitting in one of the previous
+   categories should have a sensible name (e.g., glsl_parser.yy).
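
The lifetime scheme described in the memory management answer (allocate
out of a temporary context, reparent the live nodes, free what is left)
can be sketched with a toy ownership model.  This is not the talloc API:
Context, Node, alloc_node, steal and run_stage are all hypothetical names
invented for illustration, and steal() is only loosely analogous to
talloc_steal().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Node;

// Hypothetical stand-in for an allocation context; it only tracks ownership.
struct Context {
    std::vector<Node *> owned;
    Context() = default;
    Context(const Context &) = delete;
    Context &operator=(const Context &) = delete;
    ~Context();  // frees every node this context still owns
    std::size_t size() const { return owned.size(); }
};

// Hypothetical IR node that remembers its owner, so no parent lookup is needed.
struct Node {
    Context *owner;
    int value;
};

Context::~Context() {
    for (Node *n : owned)
        delete n;
}

// Allocate a node out of the given context.
Node *alloc_node(Context &ctx, int value) {
    Node *n = new Node{&ctx, value};
    ctx.owned.push_back(n);
    return n;
}

// Move a node from its current owner to a new one.
void steal(Context &to, Node *n) {
    std::vector<Node *> &from = n->owner->owned;
    auto it = std::find(from.begin(), from.end(), n);
    assert(it != from.end());
    from.erase(it);
    to.owned.push_back(n);
    n->owner = &to;
}

// One "stage": allocate in a temporary context, reparent the live nodes
// under the shader's context, and let the temporary context free the rest.
// Returns how many dead nodes were left behind in the temporary context.
std::size_t run_stage(Context &shader, std::vector<Node *> &live) {
    Context temp;
    Node *a = alloc_node(temp, 1);
    alloc_node(temp, 2);  // becomes dead; never reparented
    Node *b = alloc_node(temp, 3);
    steal(shader, a);
    live.push_back(a);
    steal(shader, b);
    live.push_back(b);
    return temp.size();  // these nodes are freed when temp goes out of scope
}
```

Calling run_stage() with a long-lived Context leaves the two live nodes
owned by that context, while the dead node is released together with the
temporary one.  Storing the owner in each node is a design choice of this
sketch; it sidesteps the repeated parent lookup that the answer above
describes as expensive.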