| Rev |
Log message |
Author |
Age |
Path |
| 3735 |
A new implementation of Open64's support for the DSP Zero-Delay-Loop (ZDL) feature. This ZDL contribution is a cooperative effort between the global scalar optimizer (WOPT) and the code generation component (CG). The common ZDL features, i.e., hiding the loop-counting mechanism and abstracting the bottom loop condition, are implemented in WOPT with minimal changes while preserving as much optimization as possible. A pseudo branch op is introduced to expand the ZDL WHIRL operator and delay generation of the ZDL hardware instruction to the very last phase of CG loop optimization. This lazy implementation helps keep compatibility with other CG loop optimizations so that they can still produce the best code. With the new ZDL implementation, Open64 provides expressiveness equal to gcc's doloop_begin, doloop_end and decrement_and_branch_until_zero patterns.
Pragma:
#pragma zdl off
Flags:
WOPT_Enable_ZDL
OPT_Lower_ZDL
WOPT_Enable_ZDL_Early_Exit
WOPT_ZDL_Innermost_Only
CG_zdl_enabled_level[SL]
Retargeting macro:
ZDL_TARG
Retargeting interface:
INT CG_LOOP_ZDL_Gen(LOOP_DESCR*); //cg_loop.cxx
void Emit_Phase_Validity_Check(void); //cgemit.cxx
void Target_Specific_ZDLBR_Expansion(TN* target_tn); //whirl2ops.cxx
Provided options:
-WOPT:zdl
-WOPT:zdl_skip_a
-WOPT:zdl_skip_b
-WOPT:zdl_skip_e
-OPT:lower_zdl
-CG:zdl_enabled_level[SL]
Code Review: Fred Chow, Lai Jianxin and Sun Chan. |
yug |
622d 16h |
/trunk/osprey/common/com/config_wopt.h |
| 3607 |
Add extended proactive loop optimizations: Hash nested if-compare expressions to identify loop fusion candidates; Apply if-condition distribution, if-condition tree height reduction and reversed loop unswitching to expose if-merging opportunities; Apply if-merging, proactive loop fusion and loop fusion; Add head/tail duplication of if-regions; Add bit-expression simplification and dead code removal; Add more utilities and debugging flags. CR: Sun Chan |
meiye |
738d 00h |
/trunk/osprey/common/com/config_wopt.h |
| 3576 |
Merge SL changes r3231:3575. CR by Sun Chan, David Coakley and Lai Jianxin |
yug |
756d 09h |
/trunk/osprey/common/com/config_wopt.h |
| 3539 |
Addition of the -WOPT:SIB=<on|off> flag and its functionality to support
Scaled-Index-Base address mode generation. Also extended pattern
matching for add and sub instructions to generate inc and dec
instructions. |
mberg |
777d 23h |
/trunk/osprey/common/com/config_wopt.h |
| 3379 |
Open64 always lowers loops to the bottom-tested (inverted) form, which does not save code size for while-do and do loops with a large condition and a small body. So we add a heuristic to filter out those cases and lower them as ordinary head-tested loops. |
yug |
947d 07h |
/trunk/osprey/common/com/config_wopt.h |
| 3314 |
Merge changes r2717-3263 from open64-booster branch to trunk.
These changes include work done for AMD's x86 Open64 4.2.4 release. |
dcoakley |
1007d 20h |
/trunk/osprey/common/com/config_wopt.h |
| 2722 |
Merge all changes through r2711 from open64-booster branch to trunk.
These changes include work done for AMD's x86 Open64 4.2.3 release. |
dcoakley |
1253d 20h |
/trunk/osprey/common/com/config_wopt.h |
| 2322 |
Merge all changes through r2321 from open64-booster branch to trunk.
These changes include work done for AMD's x86 Open64 4.2.2 release. |
dcoakley |
1394d 18h |
/trunk/osprey/common/com/config_wopt.h |
| 1950 |
Merge branches/merge08 into trunk.
The trunk is now the latest revision for the Open64 4.2 release.
The trunk can now generate code for 5 platforms:
- IA-32/x86_64
- IA-64 (Itanium)
- CUDA (from NVIDIA)
- SL (an embedded DSP architecture, from SimpLight)
- MIPS prototype (from ICT, based on input from PathScale and SimpLight)
The trunk is merged with the PathScale 3.2 release, with many enhancements and
bug fixes from Tsinghua Univ., NVIDIA, SimpLight, HP and ICT. |
laijx |
1700d 05h |
/trunk/osprey/common/com/config_wopt.h |
| 1411 |
Replace all files in trunk with the merge branch.
Now the files in trunk should be the same as in the merge branch |
laijx |
1947d 09h |
/trunk/osprey/common/com/config_wopt.h |
| 1296 |
Fix the extreme compile-time bug for 403.gcc -O3 -ipa. This fix calculates the height of the IR tree on the fly in WOPT to get accurate information and stop copy propagation in time. We also add a weight factor to prevent excessive copy propagation from exploding the compile time. |
dehao |
2040d 16h |
/trunk/osprey/common/com/config_wopt.h |
| 1246 |
Initial implementation of loop multiversioning |
syang |
2080d 22h |
/trunk/osprey/common/com/config_wopt.h |
| 1047 |
Rename kpro64 to osprey. The makefile will not work for several hours |
laijx |
2185d 05h |
/trunk/osprey/common/com/config_wopt.h |
| 1024 |
Part of the points-to summary, not yet enabled |
syang |
2189d 01h |
/trunk/osprey/common/com/config_wopt.h |
| 1005 |
Optimization based on __attribute__((noreturn)) semantics |
syang |
2194d 17h |
/trunk/osprey/common/com/config_wopt.h |
| 926 |
The 'merge' branch is now our new trunk. |
ributzka |
2236d 01h |
/trunk/osprey/common/com/config_wopt.h |
| 869 |
Transfer r782 and r783 from trunk. r783 is the fix for the bug in r782 |
syang |
2249d 17h |
/trunk/osprey/common/com/config_wopt.h |
| 861 |
Check in the merge of the source code developed for pathscale-3.0, which includes at least:
bug fixes
IPA_Enable_Source_PU_Order
DEFAULT_UNROLL_MAX 5
Prefetch Invariant (non-constant) Stride
Prefetch_stride_ahead & Prefetch_invariant_stride
Enable_compose_bits / Enable_extract_bits
perform floating-point memory copies using integer registers
Use alternate malloc algorithm
WOPT_Enable_Str_Red_Use_Context = TRUE; /* use loop content in SR decision */
IPL_Ignore_Small_Loops
proc_has_pstatics
IPA_Enable_Source_PU_Order
-IPA:ignore_lang
__builtin_expect
-fno-gnu-exceptions
Refer to PathScale's release notes for more details. |
hucheng |
2251d 09h |
/trunk/osprey/common/com/config_wopt.h |
| 731 |
Check in the new merged files, including Shuxin's latest check-ins. |
hucheng |
2306d 13h |
/trunk/osprey/common/com/config_wopt.h |
| 725 |
Remove directory merge/trunk, added by mistake. |
hucheng |
2306d 17h |
/trunk/osprey/common/com/config_wopt.h |
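
The loop forms mentioned in the r3379 and r3735 log messages can be pictured at source level. The following is a minimal C++ sketch for illustration only, not code from Open64: all function names are hypothetical, and the compiler applies these transformations to its intermediate representation rather than to source text. It shows a head-tested loop, the equivalent bottom-tested (inverted) form in which the condition appears twice, and a count-down form of the kind a decrement-and-branch-until-zero pattern expresses.

```cpp
#include <cstddef>
#include <vector>

// Head-tested form: the loop condition appears once, at the top.
int sum_head_tested(const std::vector<int>& v) {
    int sum = 0;
    std::size_t i = 0;
    while (i < v.size()) {
        sum += v[i];
        ++i;
    }
    return sum;
}

// Bottom-tested (inverted) form: a guard plus a do-while.
// The condition is duplicated, which costs code size when the
// condition is large and the body is small.
int sum_bottom_tested(const std::vector<int>& v) {
    int sum = 0;
    std::size_t i = 0;
    if (i < v.size()) {
        do {
            sum += v[i];
            ++i;
        } while (i < v.size());
    }
    return sum;
}

// Count-down form: the trip count is computed once, and the loop
// only decrements a counter and branches until it reaches zero.
int sum_count_down(const std::vector<int>& v) {
    int sum = 0;
    std::size_t i = 0;
    for (std::size_t remaining = v.size(); remaining != 0; --remaining) {
        sum += v[i];
        ++i;
    }
    return sum;
}
```

All three functions compute the same result for any input, including an empty vector; only the placement of the loop test differs.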